Abstract. Using the square-root map $p \to \sqrt{p}$, a probability density function $p$
can be represented as a point of the unit sphere $\mathcal{S}$ in the Hilbert space of
square-integrable functions. If the density function depends smoothly on a set of parameters,
the image of the map forms a Riemannian submanifold $\mathfrak{M} \subset \mathcal{S}$. The metric on $\mathfrak{M}$
induced by the ambient spherical geometry of $\mathcal{S}$ is the Fisher information matrix.
Statistical properties of the system modelled by a parametric density function $p$ can
then be expressed in terms of information geometry. An elementary introduction to
information geometry is presented, followed by a precise geometric characterisation
of the family of Gaussian density functions. When the parametric density function
describes the equilibrium state of a physical system, certain physical characteristics
can be identified with geometric features of the associated information manifold $\mathfrak{M}$.
Applying this idea, the properties of vapour-liquid phase transitions are elucidated in
geometrical terms. For an ideal gas, phase transitions are absent and the geometry
of $\mathfrak{M}$ is flat. In this case, the solutions to the geodesic equations yield the adiabatic
equations of state. For a van der Waals gas, the associated geometry of $\mathfrak{M}$ is highly
nontrivial. The scalar curvature of $\mathfrak{M}$ diverges along the spinodal boundary which
envelops the unphysical region in the phase diagram. The curvature is thus closely

1. Statistical geometry

This paper is an overview of the information-geometric description of vapour-liquid
phase transitions in equilibrium statistical mechanics. The present section begins with
a reasonably self-contained account of the relevant background material on information
geometry. As an illustrative example we shall examine in some detail the geometry of
the space of Gaussian density functions. The relation between the information measure
of Fisher and that of Shannon and Wiener is also briefly discussed. In later sections
these ideas are applied to the information-geometric characterisation of the equilibrium
properties of noninteracting and interacting gas molecules. The relevant references are
provided in the bibliographical notes, where we also provide a brief and
perhaps incomplete history of information geometry.

1.1. From probability to geometry

The concept of 'information geometry' is a simple one which emerged from an attempt
to discriminate among different probabilities in statistical analysis. The idea can be
sketched as follows. Let $\{p_i\}_{i=1,2,\ldots,N}$ denote a set of probabilities satisfying

$$ 0 \le p_i \le 1 \quad \text{and} \quad \sum_{i=1}^{N} p_i = 1. \tag{1} $$

We introduce the following square-root map:

$$ p_i \to \xi_i = \sqrt{p_i}. \tag{2} $$

By construction, the square-root probabilities $\{\xi_i\}$ satisfy the normalisation condition

$$ \sum_{i=1}^{N} \xi_i^2 = 1. \tag{3} $$

If we regard the variables $\{\xi_i\}_{i=1,2,\ldots,N}$ as the coordinates of a vector in an $N$-dimensional
Euclidean space $\mathbb{R}^N$, then the normalisation condition (3) implies that the endpoint
of the vector $\{\xi_i\}$ lies on the unit sphere $\mathcal{S}$ in $\mathbb{R}^N$. Now suppose that $\{\eta_i\}_{i=1,2,\ldots,N}$
corresponds to a second set of square-root probabilities. Then the vector $\{\eta_i\}$ also lies
on the unit sphere in $\mathbb{R}^N$. Hence, we can measure the relative separation or overlap of
two sets of probabilities in terms of the angle

$$ \phi = \cos^{-1} \sum_{i=1}^{N} \xi_i \eta_i \tag{4} $$

between the associated square-root probability vectors. The angular separation $\phi$ clearly
vanishes if $\{\xi_i\}$ and $\{\eta_i\}$ are equal. Conversely, if $\{\xi_i\}$ and $\{\eta_i\}$ are orthogonal then
$\phi$ achieves its maximum value $\frac{1}{2}\pi$. The angular separation defined in (4) is known
as the Bhattacharyya spherical distance. Note that the squared cosine of the spherical
distance resembles the transition probability in quantum mechanics modelled on a
finite-dimensional Hilbert space.
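
As a small illustration of the spherical distance just described, the following sketch computes the angle $\phi = \cos^{-1}\sum_i \sqrt{p_i}\sqrt{q_i}$ for two discrete distributions. The function name `bhattacharyya_angle` and the sample vectors are our own illustrative choices, not part of the original text.

```python
import math

def bhattacharyya_angle(p, q):
    """Spherical distance phi = arccos(sum_i sqrt(p_i) * sqrt(q_i))."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    for dist in (p, q):
        if any(x < 0 for x in dist) or abs(sum(dist) - 1.0) > 1e-9:
            raise ValueError("each input must be a normalised probability vector")
    overlap = sum(math.sqrt(a) * math.sqrt(b) for a, b in zip(p, q))
    # Guard against rounding pushing the overlap marginally outside [0, 1].
    return math.acos(min(1.0, max(0.0, overlap)))

# Identical distributions: the angle vanishes (up to rounding).
same = bhattacharyya_angle([0.2, 0.3, 0.5], [0.2, 0.3, 0.5])
# Distributions with disjoint support: the angle takes its maximum value pi/2.
disjoint = bhattacharyya_angle([1.0, 0.0], [0.0, 1.0])
```

The two sample calls correspond to the limiting cases mentioned in the text: equal distributions and orthogonal square-root vectors.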

We turn to the notion of so-called statistical geometry, which arises from the
embedding of probability density functions in the Hilbert space of square-integrable
functions. In probability theory one typically deals with a probability density function
$p(x)$ on, say, the real line $\mathbb{R}$. For the function $p : x \to p(x)$ to represent the density of
some random variable $X$ we require that $p(x) \ge 0$ for all $x \in \mathbb{R}$ and that $\int_{\mathbb{R}} p(x)\,\mathrm{d}x = 1$.
If we consider the square-root map

$$ p(x) \to \xi(x) = \sqrt{p(x)}, \tag{5} $$

then the function $\xi(x)$ defined in this way belongs to the space $\mathcal{H} = L^2(\mathbb{R})$ of
square-integrable functions on the real line. In other words, we embed the density functions in
Hilbert space via the square-root map (5). In particular, since the square-root density
functions satisfy the normalisation condition

$$ \int_{\mathbb{R}} (\xi(x))^2\,\mathrm{d}x = 1, \tag{6} $$

the images of the map lie on the unit sphere $\mathcal{S} \subset \mathcal{H}$.

Figure 1. Bhattacharyya's spherical distance. Two vectors $\xi_1(x)$ and $\xi_2(x)$,
corresponding to a pair of probability density functions $p_1(x)$ and $p_2(x)$, lie on the
surface of the positive orthant of the unit sphere $\mathcal{S}$ in Hilbert space. The spherical
distance between two unit vectors is given by the angle $\phi$ defined in equation (7).

The advantage of working in Hilbert space $\mathcal{H}$ rather than the space of density
functions is that $\mathcal{H}$ is a vector space endowed with various geometric features that are
familiar from other branches of physics, such as quantum mechanics or general relativity.

Suppose we have a pair of density functions $p_1(x)$ and $p_2(x)$, and wish to compare
the overlap or separation of these two density functions. If the associated Hilbert space
vectors are given respectively by $\xi_1(x)$ and $\xi_2(x)$, then the overlap is measured in terms
of the inner product $\int_{\mathbb{R}} \xi_1(x)\xi_2(x)\,\mathrm{d}x$. Since $\|\xi_1\| = \|\xi_2\| = 1$, i.e. both vectors have unit
norm, this overlap is given by the cosine of the angular separation. It follows that the
Bhattacharyya spherical distance between two square-root density functions is

$$ \phi = \cos^{-1} \int_{\mathbb{R}} \xi_1(x)\xi_2(x)\,\mathrm{d}x. \tag{7} $$

This idea is illustrated schematically in Fig. 1.
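
The overlap integral for densities can be checked numerically. The sketch below, written under our own assumptions (two unit-variance Gaussian densities, a finite integration window, and a plain trapezoidal rule), estimates $\phi = \cos^{-1}\int \sqrt{p_1 p_2}\,\mathrm{d}x$ and compares it with the closed-form overlap $\exp(-(\mu_1-\mu_2)^2/8\sigma^2)$ that holds for two Gaussians of equal variance.

```python
import math

def gaussian_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2.0 * math.pi) * sigma)

def spherical_distance(p1, p2, lo, hi, steps=20000):
    """Approximate arccos( integral of sqrt(p1 * p2) dx ) with the trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.sqrt(p1(x) * p2(x))
    overlap = min(1.0, max(0.0, total * h))
    return math.acos(overlap)

# Illustrative pair: N(0, 1) and N(1, 1); the window [-12, 13] is an arbitrary wide choice.
phi = spherical_distance(lambda x: gaussian_pdf(x, 0.0, 1.0),
                         lambda x: gaussian_pdf(x, 1.0, 1.0), -12.0, 13.0)
# Closed-form overlap for equal variances: exp(-(mu1 - mu2)^2 / (8 sigma^2)).
phi_exact = math.acos(math.exp(-1.0 / 8.0))
```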

1.2. Parametric density and Fisher-Rao geometry

In theoretical statistics one typically deals with a parametric family of probability
density functions $p_\theta(x) = p(x|\theta)$. Here $\theta$ denotes one or more real parameters. For
example, a Gaussian density function is characterised by two parameters, i.e. the mean
$\mu$ and the variance $\sigma^2$. For each value, or set of values, of $\theta$ we have the normalisation
condition $\int_{\mathbb{R}} p_\theta(x)\,\mathrm{d}x = 1$.

In problems of statistical inference, it is often convenient to consider the
log-likelihood function

$$ l_\theta(x) = \ln p(x|\theta). \tag{8} $$

However, in physics it is more natural to work with the square-root density function

$$ \xi_\theta(x) = \sqrt{p(x|\theta)}, \tag{9} $$

since, as indicated above, this permits formulation of the problem in a real Hilbert-space
context. As before, for each given $\theta$ the density function is mapped to a point on the
unit sphere $\mathcal{S} \subset \mathcal{H}$ by the prescription (9). If the value of $\theta$ is changed, the image under
the map in general also varies on $\mathcal{S}$. We assume that the density function is at least
twice differentiable with respect to the parameters. Then as the parameters change
continuously, the image point on $\mathcal{S}$ will vary smoothly over a parametric subspace $\mathfrak{M}$
of the sphere $\mathcal{S}$.

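Given a parametric subspace $\mathfrak{M} \subset \mathcal{S}$, the metric of the ambient sphere $\mathcal{S}$ induces
a Riemannian metric on the subspace in the usual way. This can be seen as follows.
Recall that the Hilbert space inner product is defined by

$$ \langle \xi, \eta \rangle = \int_{\mathbb{R}} \xi(x)\eta(x)\,\mathrm{d}x. \tag{10} $$

Therefore, if we set

$$ \xi(x) = \xi_\theta(x) \quad \text{and} \quad \eta(x) = \xi_\theta(x) + \partial_i \xi_\theta(x)\,\mathrm{d}\theta^i, \tag{11} $$

where $\partial_i = \partial/\partial\theta^i$, then the squared length $\mathrm{d}s^2$ of the difference vector $\xi(x) - \eta(x)$ is
given by

$$ \mathrm{d}s^2 = \left( \int_{\mathbb{R}} \partial_i \xi_\theta(x)\,\partial_j \xi_\theta(x)\,\mathrm{d}x \right) \mathrm{d}\theta^i \mathrm{d}\theta^j. \tag{12} $$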

Before we proceed further with the derivation of the metric, let us introduce the
statistical notion of the Fisher information matrix, which is usually defined by

$$ G_{ij} = \int_{\mathbb{R}} p_\theta(x)\,\partial_i l_\theta(x)\,\partial_j l_\theta(x)\,\mathrm{d}x, \tag{13} $$

where $l_\theta(x)$ is the log-likelihood function (8). The Fisher information matrix is important
in statistics because it provides a lower bound for the variance of a parameter
estimate. Consider, for example, the case of a one-parameter family of density functions.
That is, we have a density function $p_\theta(x)$ that depends upon a single unknown parameter
$\theta$. The objective is thus to estimate the parameter by performing observations. If $T(x)$
is an unbiased estimator for $\theta$, i.e. if the expectation of $T(x)$ with respect to $p_\theta(x)$ yields
$\theta$, then we have

$$ \int_{\mathbb{R}} (T(x) - \theta)\,\xi_\theta(x)^2\,\mathrm{d}x = 0. \tag{14} $$

Differentiating with respect to $\theta$ we obtain

$$ \int_{\mathbb{R}} (T(x) - \theta)\,\xi_\theta(x)\,\partial_\theta \xi_\theta(x)\,\mathrm{d}x = \tfrac{1}{2}, \tag{15} $$

where $\partial_\theta = \partial/\partial\theta$. By the Schwarz inequality

$$ \left( \int_{\mathbb{R}} (T(x) - \theta)\,\xi_\theta(x)\,\partial_\theta \xi_\theta(x)\,\mathrm{d}x \right)^2 \le \left( \int_{\mathbb{R}} (T(x) - \theta)^2\,\xi_\theta(x)^2\,\mathrm{d}x \right) \left( \int_{\mathbb{R}} (\partial_\theta \xi_\theta(x))^2\,\mathrm{d}x \right) \tag{16} $$

we thus find

$$ \int_{\mathbb{R}} (T(x) - \theta)^2\,\xi_\theta(x)^2\,\mathrm{d}x \ge \frac{1}{4 \int_{\mathbb{R}} (\partial_\theta \xi_\theta(x))^2\,\mathrm{d}x}. \tag{17} $$

Note that the left side is the variance of the estimator, whereas the denominator of the
right side is the one-parameter form of the Fisher information matrix. The relation (17)
provides a lower bound for the variance, and is known as the Cramér-Rao inequality.
Equality in (16) is attained only when the two vectors are proportional, that
is, $\partial_\theta \xi_\theta(x) = c\,(T(x) - \theta)\,\xi_\theta(x)$ for some constant $c$. By scaling $\theta$ we can set $c = \frac{1}{2}$
without loss of generality. Hence, the lower bound of the variance is attained only if the
square-root density assumes an exponential form:

$$ \xi_\theta(x) = \frac{\exp\left(\frac{1}{2}\theta T(x)\right)}{\left( \int_{\mathbb{R}} \exp(\theta T(x))\,\mathrm{d}x \right)^{1/2}}. \tag{18} $$

The exponential family (18) plays an important role in the applications to statistical
mechanics considered below.

In a multi-parameter context the reciprocal of the Fisher information matrix
determines lower bounds for the variance in an analogous manner. From the geometrical
viewpoint, the significance of the Fisher information matrix is that it defines the induced
Riemannian metric on the parametric subspace $\mathfrak{M}$ of the unit sphere $\mathcal{S}$ in $\mathcal{H}$. Specifically,
since $\partial_i \xi_\theta = \frac{1}{2}\xi_\theta\,\partial_i l_\theta$, comparing (12) and (13) we see that

$$ \mathrm{d}s^2 = \tfrac{1}{4} G_{ij}\,\mathrm{d}\theta^i \mathrm{d}\theta^j. \tag{19} $$

The metric $\frac{1}{4}G_{ij}$ on $\mathfrak{M}$ will be referred to as the Fisher-Rao metric. The factor of a
quarter is purely a matter of convention, and the Fisher-Rao metric is thus given by a
quarter of the Fisher information matrix.
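
The factor of a quarter can be checked numerically. The following sketch assumes, purely for illustration, a one-parameter family of unit-variance Gaussians with mean $\theta$; it estimates the Fisher information $\int p_\theta (\partial_\theta \ln p_\theta)^2\,\mathrm{d}x$ and the sphere coefficient $\int (\partial_\theta \sqrt{p_\theta})^2\,\mathrm{d}x$ by finite differences and trapezoidal integration. All function names and numerical settings are hypothetical choices of ours.

```python
import math

def density(x, theta):
    """Unit-variance Gaussian with mean theta (an illustrative family, not from the text)."""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi)

def integrate(f, lo, hi, steps=20000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, steps):
        total += f(lo + k * h)
    return total * h

def fisher_information(theta, eps=1e-4, lo=-12.0, hi=12.0):
    """Integral of p (d ln p / d theta)^2, using a central difference in theta."""
    def integrand(x):
        dl = (math.log(density(x, theta + eps)) - math.log(density(x, theta - eps))) / (2 * eps)
        return density(x, theta) * dl * dl
    return integrate(integrand, lo, hi)

def sphere_coefficient(theta, eps=1e-4, lo=-12.0, hi=12.0):
    """Integral of (d sqrt(p) / d theta)^2, the coefficient of d theta^2 on the sphere."""
    def integrand(x):
        dxi = (math.sqrt(density(x, theta + eps)) - math.sqrt(density(x, theta - eps))) / (2 * eps)
        return dxi * dxi
    return integrate(integrand, lo, hi)

G = fisher_information(0.3)   # close to 1 for this family
g = sphere_coefficient(0.3)   # close to G / 4
```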



1.3. Riemannian structure of the exponential family 

We introduce here some elementary concepts in Riemannian geometry that are relevant
to the ensuing discussion. We note first that all equilibrium distributions that we
consider here are represented in the exponential form

$$ p_\theta(x) = q(x) \exp\left( -\sum_i \theta^i H_i(x) - \psi(\theta) \right), \tag{20} $$

where $\{\theta^i\}_{i=1,2,\ldots}$ are parameters, $q(x)$ represents the prescribed equilibrium state at
$\theta^i = 0$ for all $i$, and the functions $\{H_i(x)\}_{i=1,2,\ldots}$ determine the form of the energy. In
other words, we shall only consider equilibrium states represented in the canonical form.
The parameters $\{\theta^i\}$ may include inverse temperature, chemical potential, pressure,
magnetic field, and so on, whereas the functions $H_i(x)$ may represent system energy,
particle number, system volume, magnetisation, and so on. The variable $x$ ranges over
the phase space $\Gamma$ of the system. The function

$$ \psi(\theta) = \ln \int_\Gamma \exp\left( -\sum_i \theta^i H_i(x) \right) q(x)\,\mathrm{d}x \tag{21} $$

determines the overall normalisation. We refer to $\psi(\theta)$ as the thermodynamic potential
of the system. It should be evident by inspection that

$$ -\frac{\partial \psi}{\partial \theta^i} = \frac{\int_\Gamma H_i(x) \exp\left( -\sum_i \theta^i H_i(x) \right) q(x)\,\mathrm{d}x}{\int_\Gamma \exp\left( -\sum_i \theta^i H_i(x) \right) q(x)\,\mathrm{d}x} = \int_\Gamma H_i(x)\,p_\theta(x)\,\mathrm{d}x. \tag{22} $$

In other words, the first derivative of $\psi(\theta)$ with respect to $\theta^i$ determines the expectation
value of $H_i(x)$ in the equilibrium state (20). As we shall indicate below, analogous
calculations show that higher derivatives of the thermodynamic potential $\psi(\theta)$ determine
higher moments of the functions $\{H_i(x)\}$.

If the equilibrium density function assumes the form (20), then the expressions
for the corresponding Fisher-Rao metric and the coefficients of the associated metric
connection on the statistical manifold $\mathfrak{M}$ are simplified. Let us state the results first.

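Proposition 1 For a density function of the exponential form (20) the Fisher-Rao
metric $G_{ij}$ and the Christoffel symbols $\Gamma_{ijk} = G_{il}\Gamma^l_{jk}$ are given, respectively, by

$$ G_{ij} = \partial_i \partial_j \psi(\theta) \tag{23} $$

and

$$ \Gamma_{ijk} = \tfrac{1}{2}\,\partial_i \partial_j \partial_k \psi(\theta), \tag{24} $$

in terms of the canonical parametrisation $\{\theta^i\}$ on $\mathfrak{M}$, where $\partial_i = \partial/\partial\theta^i$.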
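
The content of Proposition 1 can be illustrated on a one-parameter example. The sketch below assumes, as our own toy model, $H(x) = x$ and $q(x) = 1$ on the half-line $[0, \infty)$, for which $\psi(\theta) = -\ln\theta$. It checks that $-\partial_\theta\psi$ reproduces the mean of $H$ and that $\partial_\theta^2\psi$ reproduces its variance, i.e. the single component of the metric in (23). The truncation of the integral at a finite upper limit is a numerical assumption.

```python
import math

def psi(theta):
    """Thermodynamic potential for H(x) = x, q(x) = 1 on [0, inf): ln(1/theta)."""
    return -math.log(theta)

def moment(theta, power, hi=60.0, steps=60000):
    """Trapezoidal estimate of the integral of x**power * p_theta(x) over [0, hi]."""
    h = hi / steps
    total = 0.0
    for k in range(steps + 1):
        x = k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * (x ** power) * math.exp(-theta * x - psi(theta))
    return total * h

theta, eps = 2.0, 1e-3
# Central finite differences for the first and second derivatives of psi.
d_psi = (psi(theta + eps) - psi(theta - eps)) / (2 * eps)
d2_psi = (psi(theta + eps) - 2 * psi(theta) + psi(theta - eps)) / eps ** 2

mean_H = moment(theta, 1)                  # expectation of H
var_H = moment(theta, 2) - mean_H ** 2     # variance of H
```

In this toy model the mean equals $1/\theta$ and the variance equals $1/\theta^2$, so both comparisons can also be made by hand.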

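The Christoffel symbols characterise the geodesics on $\mathfrak{M}$. Specifically, to find the
shortest path from $a$ to $b$ on $\mathfrak{M}$ we consider the variational problem:

$$ \delta \int_a^b \mathrm{d}s = 0. \tag{25} $$

From (19) we find

$$ 4\,\delta \mathrm{d}s^2 = \mathrm{d}\theta^i \mathrm{d}\theta^k\,\frac{\partial G_{ik}}{\partial \theta^l}\,\delta\theta^l + 2 G_{ik}\,\mathrm{d}\theta^i\,\mathrm{d}\delta\theta^k. \tag{26} $$

Bearing in mind that the left side of (26) equals $8\,\mathrm{d}s\,\delta\mathrm{d}s$, we obtain

$$ \int_a^b \left[ \frac{1}{2} \frac{\mathrm{d}\theta^i}{\mathrm{d}s} \frac{\mathrm{d}\theta^k}{\mathrm{d}s} \frac{\partial G_{ik}}{\partial \theta^l}\,\delta\theta^l + G_{ik} \frac{\mathrm{d}\theta^i}{\mathrm{d}s} \frac{\mathrm{d}\delta\theta^k}{\mathrm{d}s} \right] \mathrm{d}s = 0. \tag{27} $$

Integrating the second term in the integrand by parts and writing $u^i = \mathrm{d}\theta^i/\mathrm{d}s$ we see
that (27) reduces to

$$ \int_a^b \left[ \frac{1}{2} u^i u^k \frac{\partial G_{ik}}{\partial \theta^l} - \frac{\mathrm{d}}{\mathrm{d}s}\left( G_{il} u^i \right) \right] \delta\theta^l\,\mathrm{d}s = 0. \tag{28} $$

Since this must hold for arbitrary $\delta\theta^l$ we have

$$ \frac{1}{2} u^i u^k \frac{\partial G_{ik}}{\partial \theta^l} - \frac{\mathrm{d}}{\mathrm{d}s}\left( G_{il} u^i \right) = 0. \tag{29} $$

Writing

$$ \Gamma^i_{kl} = \frac{1}{2} G^{im} \left( \frac{\partial G_{mk}}{\partial \theta^l} + \frac{\partial G_{ml}}{\partial \theta^k} - \frac{\partial G_{kl}}{\partial \theta^m} \right) \tag{30} $$

for the Christoffel symbol we find that (29) can be expressed in the form

$$ \frac{\mathrm{d}u^i}{\mathrm{d}s} + \Gamma^i_{kl}\,u^k u^l = 0. \tag{31} $$

This is the geodesic equation that determines the shortest paths on $\mathfrak{M}$. Owing to their
nonlinearity, geodesic equations do not generally admit elementary analytic solutions,
although in some cases one can solve (31) in closed form, as in the Gaussian example
discussed below.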

Proof of Proposition 1. From (20) we have

$$ \partial_i l_\theta(x) = -\left( H_i(x) + \partial_i \psi(\theta) \right). \tag{32} $$

On the other hand, differentiating the normalisation condition

$$ \int_\Gamma p_\theta(x)\,\mathrm{d}x = 1 \tag{33} $$

once with respect to $\theta^i$ and using $\partial_i p_\theta(x) = p_\theta(x)\,\partial_i l_\theta(x)$ we obtain

$$ \int_\Gamma p_\theta(x) \left( H_i(x) + \partial_i \psi(\theta) \right) \mathrm{d}x = 0, \tag{34} $$

whence it follows that the expectation of $H_i(x)$ with respect to $p_\theta(x)$ is given by $-\partial_i \psi(\theta)$,
as shown in (22). Differentiating (34) with respect to $\theta^j$, we find that

$$ -\int_\Gamma p_\theta(x) \left( H_i(x) + \partial_i \psi(\theta) \right) \left( H_j(x) + \partial_j \psi(\theta) \right) \mathrm{d}x + \partial_i \partial_j \psi(\theta) = 0. \tag{35} $$

In view of (32) and (13), this implies that the Fisher-Rao metric $G_{ij}$ is given by (23).
From (30) we have

$$ \Gamma_{ikl} = \tfrac{1}{2} \left( \partial_l G_{ik} + \partial_k G_{il} - \partial_i G_{kl} \right), \tag{36} $$

but since the metric is given by (23) we immediately deduce the expression for the
Christoffel symbols. □

It is important to note that if we choose an alternative parametrisation for $\mathfrak{M}$, then
the components of the metric tensor and the Christoffel symbol cannot be calculated
using the simple expressions given in Proposition 1 and we must use the defining
equations (13) and (30) to determine these quantities. Also note that the metric tensor
is the covariance matrix of the functions $\{H_i(x)\}$, whereas the components of $\Gamma_{ijk}$ are
third-order cross-moments of $\{H_i(x)\}$.

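In terms of the Christoffel symbols $\Gamma^i_{jk}$ the Riemann curvature tensor $R^i_{\ ljk}$ can be
expressed as

$$ R^i_{\ ljk} = \partial_k \Gamma^i_{lj} - \partial_j \Gamma^i_{lk} - \Gamma^i_{jh}\Gamma^h_{lk} + \Gamma^i_{kh}\Gamma^h_{lj}. \tag{37} $$

The Riemann tensor encodes the local geometry of $\mathfrak{M}$, and is related to the parallel
transport of vectors on $\mathfrak{M}$. In particular, if we define the covariant derivative by

$$ \nabla_j A_i = \frac{\partial A_i}{\partial \theta^j} - \Gamma^k_{ij} A_k, \tag{38} $$

then the commutator of the covariant derivatives defines the Riemann tensor:

$$ \nabla_j \nabla_k A_l - \nabla_k \nabla_j A_l = A_i R^i_{\ ljk}. \tag{39} $$

The symmetry properties of the Riemann tensor can be derived by lowering the index
with the metric and writing $R_{ijkl} = G_{im} R^m_{\ jkl}$. Specifically, this is given by

$$ R_{ijkl} = -\frac{1}{2} \left( \frac{\partial^2 G_{il}}{\partial\theta^j \partial\theta^k} + \frac{\partial^2 G_{jk}}{\partial\theta^i \partial\theta^l} - \frac{\partial^2 G_{ik}}{\partial\theta^j \partial\theta^l} - \frac{\partial^2 G_{jl}}{\partial\theta^i \partial\theta^k} \right) + G_{mn} \left( \Gamma^m_{jl}\Gamma^n_{ik} - \Gamma^m_{jk}\Gamma^n_{il} \right), \tag{40} $$

whence it follows that $R_{ijkl} = R_{klij}$. Along with the relation $R^i_{\ jkl} = -R^i_{\ jlk}$ that follows
from (37) we find that $R_{ijkl} = -R_{ijlk} = -R_{jikl}$. Therefore, the components of $R_{ijkl}$
vanish when $i = j$ or $k = l$. In the case of a two-dimensional manifold $\mathfrak{M}$ the only
nonvanishing components of the Riemann tensor are given by

$$ R_{1212} = -R_{1221} = -R_{2112} = R_{2121}. \tag{41} $$

In other words, $R_{1212}$ is the sole independent component in two dimensions.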

Given a Riemann tensor $R^i_{\ jkl}$ we define the associated Ricci tensor by the symmetric
expression

$$ R_{jl} = R^k_{\ jkl}. \tag{42} $$

Equivalently, we can write $R_{jl} = G^{ik} R_{ijkl}$. A further contraction with the metric defines
the scalar curvature:

$$ R = G^{jl} R_{jl}. \tag{43} $$

We call (43) the Ricci scalar curvature. If the Ricci tensor $R_{jl}$ is proportional to the
metric tensor $G_{jl}$ then the manifold $\mathfrak{M}$ is called an Einstein space. This is because the
metric of such a space satisfies the vacuum Einstein equation

$$ R^i_{\ j} - \tfrac{1}{2}\,\delta^i_{\ j} R = 0, \tag{44} $$

where $R^i_{\ j} = G^{ik} R_{kj}$. The significance of the Einstein equation in statistics or statistical
mechanics can be seen if we relax the normalisation condition on the square-root density
function $\xi(x)$ and thus eliminate the physically irrelevant degree of freedom associated
with the norm $\|\xi(x)\|$. Specifically, if $T(x)$ represents an observable function on the
phase space $\Gamma$ then its expectation value in the generic 'state' $\xi(x)$ is

$$ \langle T \rangle = \frac{\int_\Gamma \xi(x)\,T(x)\,\xi(x)\,\mathrm{d}x}{\int_\Gamma \xi(x)\,\xi(x)\,\mathrm{d}x}. \tag{45} $$

Evidently, the expectation so defined is invariant under the scale transformation
$\xi(x) \to \lambda\xi(x)$, where $\lambda$ is an arbitrary nonzero number. Thus, all the relevant statistical
information is encoded in the direction of the vector $\xi(x) \in \mathcal{H}$, irrespective of its length.
Via the identification $\xi(x) \sim \lambda\xi(x)$ we obtain a space of rays in $\mathcal{H}$, which is known as
the real projective Hilbert space. Suppose now that we consider the Einstein equation
(44) for the metric on the projective Hilbert space. Then there is a unique solution
which is induced by the infinitesimal form of the Bhattacharyya spherical distance (7).
Specifically, this is obtained by setting $\xi_1(x) = \xi(x)$ and $\xi_2(x) = \xi(x) + \mathrm{d}\xi(x)$ in

$$ \cos^2\phi = \frac{\left( \int_{\mathbb{R}} \xi_1(x)\xi_2(x)\,\mathrm{d}x \right)^2}{\left( \int_{\mathbb{R}} \xi_1(x)\xi_1(x)\,\mathrm{d}x \right) \left( \int_{\mathbb{R}} \xi_2(x)\xi_2(x)\,\mathrm{d}x \right)}, \tag{46} $$

Taylor expanding each side, and retaining terms of quadratic order. Then with the
notation of (10) we can write

$$ \mathrm{d}s^2 = \frac{\langle \mathrm{d}\xi, \mathrm{d}\xi \rangle \langle \xi, \xi \rangle - \langle \xi, \mathrm{d}\xi \rangle^2}{\langle \xi, \xi \rangle^2}, \tag{47} $$

which defines a metric on the projective Hilbert space. It follows that the Einstein
equation uniquely determines the probabilistic properties of the space of densities.

For a statistical manifold $\mathfrak{M}$ associated with a distribution of the exponential type
(20), the Riemann tensor assumes a simple form because the first two terms in (37)
cancel, and only the contractions of the Christoffel symbols remain.

The examples of statistical mechanical systems considered here are parameterised
by a pair of external variables, so that the statistical manifold $\mathfrak{M}$ is two-dimensional.
In this case, the expression for the scalar curvature admits further simplifications.
Specifically, we have the following:

Proposition 2 In terms of the canonical parametrisation $(\theta^1, \theta^2)$, the scalar curvature
of a two-dimensional statistical model corresponding to the density function (20) is given
by the determinant

$$ R = -\frac{1}{2G^2} \begin{vmatrix} \psi_{11}(\theta) & \psi_{12}(\theta) & \psi_{22}(\theta) \\ \psi_{111}(\theta) & \psi_{112}(\theta) & \psi_{122}(\theta) \\ \psi_{112}(\theta) & \psi_{122}(\theta) & \psi_{222}(\theta) \end{vmatrix}, \tag{48} $$

where $G = \det(G_{ij})$ is the determinant of the Fisher-Rao metric, and where
$\psi_{11}(\theta) = \partial^2\psi(\theta)/\partial\theta^1\partial\theta^1$, $\psi_{112}(\theta) = \partial^3\psi(\theta)/\partial\theta^1\partial\theta^1\partial\theta^2$, and so on.

Proof. Recall that the scalar curvature is defined by a contraction of the Riemann tensor,
which for the exponential family reduces to

$$ R_{ijkl} = \left( \Gamma_{kmi}\Gamma_{jln} - \Gamma_{kmj}\Gamma_{iln} \right) G^{mn}. \tag{49} $$

Substituting (24) and the inverse of (23) in (49), we obtain (48) after some rearrangement
of terms. □
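
The determinant formula of Proposition 2, $R = -\det(\cdot)/2G^2$ with rows $(\psi_{11}, \psi_{12}, \psi_{22})$, $(\psi_{111}, \psi_{112}, \psi_{122})$ and $(\psi_{112}, \psi_{122}, \psi_{222})$, is straightforward to evaluate once the derivatives of $\psi$ are known. The sketch below applies it to a Gaussian written in the canonical form $\exp(-\theta^1 x^2 - \theta^2 x - \psi)$, for which normalisation gives $\psi = (\theta^2)^2/4\theta^1 + \frac{1}{2}\ln(\pi/\theta^1)$. This parametrisation and all function names are assumptions made for the example; the derivatives are entered analytically.

```python
import math

def gaussian_psi_derivatives(a, b):
    """Second and third derivatives of psi(a, b) = b**2/(4a) + 0.5*ln(pi/a), a > 0.
    Index 1 refers to a (coefficient of x^2), index 2 to b (coefficient of x)."""
    second = {
        "11": b * b / (2 * a ** 3) + 1 / (2 * a * a),
        "12": -b / (2 * a * a),
        "22": 1 / (2 * a),
    }
    third = {
        "111": -3 * b * b / (2 * a ** 4) - 1 / a ** 3,
        "112": b / a ** 3,
        "122": -1 / (2 * a * a),
        "222": 0.0,
    }
    return second, third

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def scalar_curvature(second, third):
    """R = -det / (2 G^2), where G is the determinant of the Hessian of psi."""
    G = second["11"] * second["22"] - second["12"] ** 2
    rows = [
        [second["11"], second["12"], second["22"]],
        [third["111"], third["112"], third["122"]],
        [third["112"], third["122"], third["222"]],
    ]
    return -det3(rows) / (2 * G * G)

R = scalar_curvature(*gaussian_psi_derivatives(1.0, 2.0))
```

For this family the value obtained is independent of the point chosen, in line with the constant curvature found for the Gaussian manifold.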
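1.4. Geometry of Gaussian distributions

As an elementary illustrative example, consider the Gaussian (normal) distribution
$N(\mu, \sigma)$ on the real line $\mathbb{R}$ with mean $\mu$ and standard deviation $\sigma > 0$. For the parameterised
density function we have

$$ p(x|\mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right). \tag{50} $$

The normal density function can be rewritten in the canonical form

$$ p(x|\theta) = \exp\left[ -\theta^1 x^2 - \theta^2 x - \psi(\theta) \right], \tag{51} $$

where

$$ \theta^1 = \frac{1}{2\sigma^2}, \quad \theta^2 = -\frac{\mu}{\sigma^2}, \quad \psi(\theta) = \frac{(\theta^2)^2}{4\theta^1} + \frac{1}{2} \ln\frac{\pi}{\theta^1}. \tag{52} $$

By differentiating $\psi(\theta)$ with respect to the parameters $\{\theta^i\}$ we can determine the
components of the metric tensor $G_{ij}(\theta)$ in the coordinate system specified by the
canonical parametrisation $\{\theta^i\}$.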

Alternatively, we may regard the mean $\mu$ and standard deviation $\sigma$ as coordinates on the
statistical manifold $\mathfrak{M}$. In terms of the parameters $(\mu, \sigma)$ the metric does not admit a
simple representation (23), and we must perform the Gaussian integration in the defining
relation (13). The line element then becomes

$$ \mathrm{d}s^2 = \frac{1}{\sigma^2} \left( \mathrm{d}\mu^2 + 2\,\mathrm{d}\sigma^2 \right), \tag{53} $$

which is defined on the upper-half plane $-\infty < \mu < \infty$ and $0 < \sigma < \infty$. Since the
metric $G_{ij}$ is diagonal in these coordinates, it is easily inverted and we obtain

$$ G^{ij} = \begin{pmatrix} \sigma^2 & 0 \\ 0 & \frac{1}{2}\sigma^2 \end{pmatrix}. \tag{54} $$

A short calculation shows that the Christoffel symbols are given by

$$ \Gamma^1_{12} = \Gamma^1_{21} = -2\Gamma^2_{11} = \Gamma^2_{22} = -\frac{1}{\sigma}, \quad \Gamma^1_{11} = \Gamma^1_{22} = \Gamma^2_{12} = 0. \tag{55} $$

Since the inverse of the metric tensor is diagonal, we need only determine the
diagonal components of the Ricci tensor in order to calculate the scalar curvature (that
is, the off-diagonal elements of the Ricci tensor vanish). These are

$$ R_{11} = -\frac{1}{2\sigma^2}, \quad R_{22} = -\frac{1}{\sigma^2}, \tag{56} $$

respectively. Hence, the resulting geometry is that of a hyperbolic space, which is a
homogeneous manifold of constant negative curvature:

$$ R = -1. \tag{57} $$

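This space has many interesting properties. For example, the geodesic equations
(31) characterising trajectories of shortest paths on $\mathfrak{M}$ are determined by the equations

$$ \frac{\mathrm{d}^2\mu}{\mathrm{d}s^2} - \frac{2}{\sigma(s)} \frac{\mathrm{d}\mu}{\mathrm{d}s} \frac{\mathrm{d}\sigma}{\mathrm{d}s} = 0 \tag{58} $$

and

$$ \frac{\mathrm{d}^2\sigma}{\mathrm{d}s^2} + \frac{1}{\sigma(s)} \left\{ \frac{1}{2} \left( \frac{\mathrm{d}\mu}{\mathrm{d}s} \right)^2 - \left( \frac{\mathrm{d}\sigma}{\mathrm{d}s} \right)^2 \right\} = 0. \tag{59} $$

Since $\sigma > 0$ we can divide (58) and (59) by $\sigma$, obtaining

$$ \left( \frac{\mu'}{\sigma} \right)' - \frac{\mu'\sigma'}{\sigma^2} = 0 \quad \text{and} \quad \left( \frac{\sigma'}{\sigma} \right)' + \frac{1}{2} \left( \frac{\mu'}{\sigma} \right)^2 = 0, \tag{60} $$

where the prime indicates $\mathrm{d}/\mathrm{d}s$. It follows that

$$ \left[ \left( \frac{\mu'}{\sigma} \right)^2 + 2 \left( \frac{\sigma'}{\sigma} \right)^2 \right]' = 0, \tag{61} $$

and hence that

$$ \left( \frac{\mu'}{\sigma} \right)^2 + 2 \left( \frac{\sigma'}{\sigma} \right)^2 = v^2, \tag{62} $$

where $v > 0$ is a constant. On the other hand, if we define $X = \mu'/\sigma$ then from the
first equation of (60) we have $X\sigma' - X'\sigma = 0$, or equivalently $(X/\sigma)' = 0$, and thus
$\mu'/\sigma = c\sigma$, where $c$ is a constant. Substituting this in (62) we deduce that

$$ c^2\sigma^2 + \left( \frac{\sigma'}{\sigma} \right)^2 = v^2, \tag{63} $$

where we have rescaled the integration constants, i.e. $c \to \sqrt{2}\,c$ and $v \to \sqrt{2}\,v$. There
are now two cases to consider, depending on whether $c$ is zero or nonzero.

If $c = 0$, then $\mu$ is constant, and $\sigma' = v\sigma$, so that $\sigma(s) = a\,e^{vs}$ for some constants $a$
and $v$ such that $a > 0$ and $v > 0$. This represents a straight line parallel to the $\sigma$ axis in
the $\mu$-$\sigma$ plane. If $c \ne 0$, then $c\sigma \le v$ from (63), hence $\sigma(s) = \frac{v}{c} \sin\gamma(s)$ for some $\gamma(s)$.
Substituting this into (63) we obtain

$$ \left( \frac{\mathrm{d}\gamma}{\mathrm{d}s} \right)^2 = v^2 \sin^2\gamma(s). \tag{64} $$

Since $\gamma' \ne 0$, $\gamma(s)$ is monotonic and thus invertible. We may assume, without loss of
generality, that $\gamma' > 0$ so that (64) implies $\mathrm{d}\gamma/\mathrm{d}s = v \sin\gamma$. Using $\sigma(s) = \frac{v}{c} \sin\gamma(s)$ we
find that

$$ \frac{\sigma'}{\sigma} = \frac{1}{\sigma} \frac{\mathrm{d}\sigma}{\mathrm{d}\gamma} \frac{\mathrm{d}\gamma}{\mathrm{d}s} = v \cos\gamma, \tag{65} $$

and further, using (62) with rescaled $c$ and $v$ we obtain

$$ \mu'(s) = v\,\sigma(s) \sin\gamma(s). \tag{66} $$

If we regard $s = s(\gamma)$ as parameterised by $\gamma$ we can write

$$ \mu(\gamma) = \int \frac{\mathrm{d}\mu}{\mathrm{d}s} \frac{\mathrm{d}s}{\mathrm{d}\gamma}\,\mathrm{d}\gamma = \int \frac{\mu'}{\gamma'}\,\mathrm{d}\gamma = \frac{v}{c} \int \sin\gamma\,\mathrm{d}\gamma = -\frac{v}{c} \cos\gamma + b, \tag{67} $$

where $b$ is an integration constant.

To summarise, if we regard $\gamma(s)$ as the independent parameter, then the solutions
to the geodesic equations for the Gaussian family of densities (50) are

$$ \mu(s) = -\frac{v}{c} \cos\gamma(s) + b \quad \text{and} \quad \sigma(s) = \frac{v}{c} \sin\gamma(s). \tag{68} $$

Figure 2. Geodesic curves for Gaussian distributions. The statistical manifold $\mathfrak{M}$ in
this case is the upper half plane parameterised by $\mu$ and $\sigma$. We have $-\infty < \mu < \infty$
and $0 < \sigma < \infty$. The shortest path joining the two normal distributions $N(\mu_1, \sigma_1)$ and
$N(\mu_2, \sigma_2)$ is given by the unique semi-circular arc through the given two points and
centred on the boundary line $\sigma = 0$.

These equations represent half-circles on the $\mu$-$\sigma$ plane centred on the axis ($\sigma = 0$)
with radius $v/c$. In Figure 2 we sketch examples of geodesic curves for the Gaussian
family, for $c \ne 0$.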
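
The two first integrals used in the derivation can be checked by direct numerical integration. The sketch below assumes the geodesic equations in the form $\mu'' - 2\mu'\sigma'/\sigma = 0$ and $\sigma'' + \mu'^2/2\sigma - \sigma'^2/\sigma = 0$, which follow from the line element $\mathrm{d}s^2 = (\mathrm{d}\mu^2 + 2\,\mathrm{d}\sigma^2)/\sigma^2$, and integrates them with a classical fourth-order Runge-Kutta step. It then monitors $(\mu'/\sigma)^2 + 2(\sigma'/\sigma)^2$ and $\mu'/\sigma^2$, both of which should remain constant along a geodesic. The initial data, step size and function names are illustrative choices of ours.

```python
def geodesic_rhs(state):
    """Geodesic equations rewritten as a first-order system in (mu, sigma, mu', sigma')."""
    mu, sigma, dmu, dsigma = state
    d2mu = 2.0 * dmu * dsigma / sigma
    d2sigma = (dsigma * dsigma - 0.5 * dmu * dmu) / sigma
    return (dmu, dsigma, d2mu, d2sigma)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(base, slope, factor):
        return tuple(b + factor * s for b, s in zip(base, slope))
    k1 = geodesic_rhs(state)
    k2 = geodesic_rhs(shifted(state, k1, 0.5 * h))
    k3 = geodesic_rhs(shifted(state, k2, 0.5 * h))
    k4 = geodesic_rhs(shifted(state, k3, h))
    return tuple(s + h * (a + 2 * b + 2 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def speed_squared(state):
    """(mu'/sigma)^2 + 2 (sigma'/sigma)^2, conserved along a geodesic."""
    _, sigma, dmu, dsigma = state
    return (dmu / sigma) ** 2 + 2.0 * (dsigma / sigma) ** 2

def c_constant(state):
    """mu' / sigma^2, the second conserved quantity."""
    _, sigma, dmu, _ = state
    return dmu / sigma ** 2

state = (0.0, 1.0, 1.0, 0.0)   # (mu, sigma, mu', sigma') at s = 0
v2_start, c_start = speed_squared(state), c_constant(state)
for _ in range(2000):          # integrate up to s = 2
    state = rk4_step(state, 1e-3)
v2_end, c_end = speed_squared(state), c_constant(state)
```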

It follows from the solutions to the geodesic equations that the separation of a pair
of normal distributions $N(\mu_1, \sigma_1)$ and $N(\mu_2, \sigma_2)$ is given by

$$ D(p_1, p_2) = \frac{1}{\sqrt{2}} \log \frac{1 + \delta_{1,2}}{1 - \delta_{1,2}}, \tag{69} $$

where the function $\delta_{1,2}$ defined by

$$ \delta_{1,2} = \sqrt{ \frac{(\mu_2 - \mu_1)^2 + 2(\sigma_2 - \sigma_1)^2}{(\mu_2 - \mu_1)^2 + 2(\sigma_2 + \sigma_1)^2} } \tag{70} $$

lies between 0 and 1. These results follow directly from the fact that the geodesics are
semi-circular arcs centred on the boundary line $\sigma = 0$ (this line itself is not part of the
manifold $\mathfrak{M}$ because $\sigma > 0$). In the exceptional case when $\mu_1 = \mu_2$, the geodesic is a
straight line $\mu = \text{constant}$, and

$$ D(p_1, p_2) = \frac{1}{\sqrt{2}} \log \frac{\sigma_2}{\sigma_1}. \tag{71} $$

The above example illustrates how various geometric aspects of a statistical
manifold $\mathfrak{M}$ can be investigated in a systematic manner. It is interesting to note,
in particular, that the Gaussian distributions define an elementary hyperbolic geometry
with constant negative curvature. Before examining various geometric characterisations
of ideal and interacting gases in thermal equilibrium, let us discuss the relation between
the Fisher-Rao distance measure and various measures of entropy, a topic of some
interest.

1.5. From entropy to Fisher information

We have observed how the notion of information geometry arises from the Fisher
information matrix commonly used in statistical analysis. On the other hand, the
term 'information' often suggests the concept of entropy, rather than the Fisher matrix.
Indeed, the entropy concept is essential in thermodynamics and statistical mechanics.
Therefore, it would be appropriate to clarify the interrelation between these two notions
of information.

We shall discuss entropy in a fairly general context, and consider as before a
parametric family of probability density functions $p(x|\theta)$ which we assume to be defined,
say, on the real line $\mathbb{R}$, where $\theta = \{\theta^i\}$. Then with respect to any twice-differentiable
concave function $\varphi(p)$ we define the associated entropy functional by the expression

$$ H_\varphi(p) = \int_{\mathbb{R}} \varphi[p(x|\theta)]\,\mathrm{d}x. \tag{72} $$

Now, if $f$ represents a vector in the tangent space (at $p$) of the manifold of density
functions, then the derivative of the entropy $H_\varphi$ at $p$ in the direction $f$ is defined by

$$ \mathrm{d}H_\varphi(p; f) = \left. \frac{\mathrm{d}}{\mathrm{d}s} H_\varphi(p + sf) \right|_{s=0} = \int_{\mathbb{R}} \varphi'[p(x)]\,f(x)\,\mathrm{d}x, \tag{73} $$

where $\varphi'(p) = \mathrm{d}\varphi(p)/\mathrm{d}p$. Similarly, if $f$ and $g$ are two vectors in the tangent space at
$p$, we define the Hessian of $H_\varphi$ by

$$ \mathrm{d}^2 H_\varphi(p; f, g) = \int_{\mathbb{R}} \varphi''[p(x)]\,f(x)\,g(x)\,\mathrm{d}x. \tag{74} $$

The corresponding quadratic form is

$$ \Delta_f H_\varphi(p) = 4\,\mathrm{d}^2 H_\varphi(p; f, f), \tag{75} $$

or equivalently,

$$ \Delta_f H_\varphi(p) = 4 \int_{\mathbb{R}} \varphi''[p(x)]\,f(x)^2\,\mathrm{d}x, \tag{76} $$

where the factor of 4 here is purely conventional. The concavity of $\varphi$ then implies that

$$ -\Delta_f H_\varphi(p) \ge 0. \tag{77} $$

In particular, if we choose $f$ to be $\partial_i p$, where $\partial_i = \partial/\partial\theta^i$, then we have

$$ \Delta_{\partial_i p} H_\varphi(p) = 4 \int_{\mathbb{R}} \varphi''[p(x|\theta)] \left( \partial_i p(x|\theta) \right)^2 \mathrm{d}x. \tag{78} $$

Thus far, we have not specified the form of the function $\varphi$, except for the
requirements of concavity and twice differentiability. As a special case, let us consider
the one-parameter family of concave functions

$$ \varphi_\alpha(z) = (\alpha - 1)^{-1} (z - z^\alpha), \tag{79} $$

where $\alpha > 0$. This determines a one-parameter family of entropies $H_\alpha(p)$ given by

$$ H_\alpha(p) = \frac{1}{\alpha - 1} \left( 1 - \int_{\mathbb{R}} p^\alpha(x)\,\mathrm{d}x \right). \tag{80} $$

Note that in the limit $\alpha \to 1$ we have $\varphi_1(z) = -z \ln z$ and hence

$$ H_1(p) = -\int_{\mathbb{R}} p(x) \ln p(x)\,\mathrm{d}x. \tag{81} $$

In other words, we recover the expression for the familiar Shannon-Wiener entropy in
the limit $\alpha \to 1$. In the general case, the expression (80) defines the Havrda-Charvát
entropy (also known as the $\alpha$-order entropy), which is related to the well-known Rényi
entropy $R_\alpha(p)$ as follows:

$$ R_\alpha(p) = \frac{1}{1 - \alpha} \ln\left[ 1 + (1 - \alpha) H_\alpha(p) \right]. \tag{82} $$

In particular, $R_\alpha$ is monotonic in $H_\alpha$.
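
The relation between the two entropies is easy to verify on a discrete analogue, in which integrals over $x$ are replaced by sums over outcomes. The sketch below makes this replacement (an assumption of ours, since the text works with densities) and checks that $R_\alpha = (1-\alpha)^{-1}\ln[1 + (1-\alpha)H_\alpha]$ agrees with the direct definition $R_\alpha = (1-\alpha)^{-1}\ln\sum_i p_i^\alpha$, and that $H_\alpha$ approaches the Shannon-Wiener entropy as $\alpha \to 1$.

```python
import math

def havrda_charvat(p, alpha):
    """H_alpha = (1 - sum p_i^alpha) / (alpha - 1); Shannon entropy at alpha = 1."""
    if abs(alpha - 1.0) < 1e-12:
        return -sum(x * math.log(x) for x in p if x > 0)
    return (1.0 - sum(x ** alpha for x in p)) / (alpha - 1.0)

def renyi(p, alpha):
    """R_alpha = ln(sum p_i^alpha) / (1 - alpha), for alpha != 1."""
    return math.log(sum(x ** alpha for x in p)) / (1.0 - alpha)

p = [0.5, 0.3, 0.2]            # illustrative distribution
alpha = 2.0
h = havrda_charvat(p, alpha)
r_direct = renyi(p, alpha)
r_from_h = math.log(1.0 + (1.0 - alpha) * h) / (1.0 - alpha)

shannon = havrda_charvat(p, 1.0)
near_one = havrda_charvat(p, 1.0 + 1e-6)   # should be close to the Shannon value
```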

Also, choosing $\varphi_\alpha$ as in (79) we find that

$$ \mathrm{d}s_\alpha^2 = -\frac{1}{4\alpha}\,\Delta_{\mathrm{d}p} H_\alpha(p) = -\frac{1}{\alpha}\,\mathrm{d}^2 H_\alpha(p; \partial_i p, \partial_j p)\,\mathrm{d}\theta^i \mathrm{d}\theta^j \tag{83} $$

is positive definite and defines a Riemannian metric. We can summarise this as follows.

Proposition 3 (Burbea-Rao metric) The coefficients of the differential metric

$$ \mathrm{d}s_\alpha^2 = G^{(\alpha)}_{ij}\,\mathrm{d}\theta^i \mathrm{d}\theta^j \tag{84} $$

associated with the Hessian of the $\alpha$-order entropy (80) are

$$ G^{(\alpha)}_{ij} = \int_{\mathbb{R}} p^\alpha\,(\partial_i \ln p)(\partial_j \ln p)\,\mathrm{d}x. \tag{85} $$

In particular, when $\alpha = 1$, $G^{(\alpha)}_{ij}$ reduces to the Fisher-Rao metric.

Proof. The expression in (85) follows at once from (83) by virtue of the relation
$\varphi''_\alpha(p) = -\alpha p^{\alpha - 2}$. □

We find, therefore, that the so-called $\alpha$-order entropy metric is closely related to the
Fisher-Rao geometry of the statistical manifold. In addition, there is another significant
relationship between the derivative of the entropy and the $\alpha$-order Fisher information
matrix. This can be established as follows. From the defining equation (80) we have

$$ \partial_i \partial_j H_\alpha(p) = -\alpha \int_{\mathbb{R}} p^{\alpha - 2} (\partial_i p)(\partial_j p)\,\mathrm{d}x - \frac{\alpha}{\alpha - 1} \int_{\mathbb{R}} p^{\alpha - 1}\,\partial_i \partial_j p\,\mathrm{d}x, \tag{86} $$

and therefore we deduce that

$$ G^{(\alpha)}_{ij} = -\frac{1}{\alpha}\,\partial_i \partial_j H_\alpha(p) - \frac{1}{\alpha - 1} \int_{\mathbb{R}} p^{\alpha - 1}\,\partial_i \partial_j p\,\mathrm{d}x. \tag{87} $$

For the canonical distribution (20), the limit $\alpha \to 1$ of this relation yields the
Shannon-Wiener entropy

$$ H_1(p) = \sum_i \theta^i \int p(x|\theta)\,H_i(x)\,\mathrm{d}x + \psi(\theta). \tag{88} $$

In other words, the thermodynamic potential and the entropy are related by a Legendre
transformation. Consequently, the Fisher-Rao geometry and the geometry arising from
the Hessian of the Shannon-Wiener entropy are related by the general theory of Legendre
transforms.

2. Classical ideal gas

We shall now characterise the geometry of the statistical manifold that arises from
the equilibrium distribution of a gas of noninteracting particles. Although this system
displays no phase transition, the analysis presented here will provide an enlightening
contrast with the results of Section 3 where we shall examine the geometry of the van
der Waals gas, which does exhibit a liquid-vapour transition.



2.1. Partition function in P-T distribution 

To elucidate the geometrical representation of gaseous systems in statistical mechanics,
we begin our analysis with a system of noninteracting identical particles in the absence
of potential energy. Physically, this system corresponds to a classical ideal gas immersed
in a heat bath. As we shall show below, not only the Riemann curvature but also the
geodesic equations for this system can be solved exactly. We consider, in particular, a
pressure-temperature ($P$-$T$) distribution (also known as the Boguslavski distribution)
of the form

$$ p(H, V|\alpha, \beta) = \frac{\exp(-\beta H - \alpha V)}{Z(\alpha, \beta)} \tag{89} $$

defined on the phase space $\Gamma$ of a system of noninteracting particles. Here the partition
function $Z(\alpha, \beta)$ is determined by the phase-space and volume integral

$$ Z(\alpha, \beta) = \frac{1}{N!\,h^{3N}} \int_0^\infty \left( \int_\Gamma \exp(-\beta H)\,\mathrm{d}q\,\mathrm{d}p \right) \exp(-\alpha V)\,\mathrm{d}V. \tag{90} $$

As usual, we have $\beta = 1/k_B T$, $\alpha = P/k_B T$, where $P$ denotes the pressure, $h$ the Planck
constant, and $N$ the number of particles. The Hamiltonian $H$ is just the free particle
kinetic energy

$$ H = \sum_{i=1}^{N} \frac{p_i^2}{2m}. \tag{91} $$

Thus, we consider a closed system of noninteracting gas molecules immersed in a heat
bath at inverse temperature $\beta$ and effective pressure $\alpha$. Since the system is in contact
with a bath at fixed temperature and pressure, the system energy and volume fluctuate.
In thermal equilibrium, the distribution of these variables is determined by (89). For
a real gas, the constituent particles inevitably interact. Nevertheless, the ideal gas
represented by the distribution (89) adequately characterises the properties of a real gas
at high temperature or low density, where the effects of inter-particle interactions can
be neglected.

Comparing (89) and (20) we observe that the thermodynamic potential is given
by $\psi(\alpha, \beta) = \ln Z(\alpha, \beta)$. Therefore, to determine the Fisher-Rao metric (23) we must
perform the integration (90). Noting the fact that each $q$-integration in (90) gives the
volume $V$ of the system, one obtains the partition function

$$ Z(\alpha, \beta) = \left( \frac{2\pi m}{h^2 \beta} \right)^{3N/2} \alpha^{-(N+1)}. \tag{92} $$

This follows from the fact that

$$ \int_\Gamma \exp(-\beta H)\,\mathrm{d}q\,\mathrm{d}p = \int_q \int_p \mathrm{e}^{-\beta \sum_{i=1}^{N} p_i^2/2m} \prod_{i=1}^{N} \mathrm{d}^3 q_i \prod_{i=1}^{N} \mathrm{d}^3 p_i \tag{93} $$

is just a product of Gaussian integrals, and the identity

$$ \frac{1}{N!} \int_0^\infty V^N \mathrm{e}^{-\alpha V}\,\mathrm{d}V = \alpha^{-(N+1)} \tag{94} $$

that holds for $\alpha > 0$.
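
The volume integral identity, $\frac{1}{N!}\int_0^\infty V^N \mathrm{e}^{-\alpha V}\,\mathrm{d}V = \alpha^{-(N+1)}$, can be confirmed numerically for small $N$. The sketch below uses a trapezoidal rule on a truncated interval; the cut-off at $V = 60/\alpha$, the step count, and the sample values of $N$ and $\alpha$ are arbitrary numerical choices of ours.

```python
import math

def volume_integral(n, alpha, steps=100000):
    """Trapezoidal estimate of (1/n!) * integral_0^inf V^n exp(-alpha V) dV."""
    upper = 60.0 / alpha          # the integrand is negligible beyond this point
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        v = k * h
        w = 0.5 if k in (0, steps) else 1.0
        total += w * v ** n * math.exp(-alpha * v)
    return total * h / math.factorial(n)

n, alpha = 3, 1.7
numeric = volume_integral(n, alpha)
exact = alpha ** (-(n + 1))
```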

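Note that the partition function $z(\beta, V)$ in the canonical ensemble is

$$ z(\beta, V) = \frac{1}{N!\,h^{3N}} \int_\Gamma \exp(-\beta H)\,\mathrm{d}q\,\mathrm{d}p = \frac{1}{N!} \left( \frac{2\pi m}{h^2 \beta} \right)^{3N/2} V^N, \tag{95} $$

from which one can calculate the Helmholtz free energy

$$ F(\beta, V) = -k_B T \ln z(\beta, V) \tag{96} $$

and thus obtain the equation of state

$$ P = -\left( \frac{\partial F}{\partial V} \right)_\beta = N k_B T / V \tag{97} $$

satisfied by a classical ideal gas.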

2.2. Curvature and geodesics for ideal gas



The expression (92) for the partition function clearly shows that the Riemannian
geometry of the statistical model $\mathfrak{M}$ associated with the classical ideal gas depends upon
the number $N$ of particles. Although finite size effects in small systems are sometimes of
interest, here we are primarily concerned with the geometry that arises in the so-called
thermodynamic limit $N \to \infty$. Thus, we consider the thermodynamic potential $\psi(\alpha,\beta)$
per particle in the thermodynamic limit, given by

$$ \psi(\alpha,\beta) = \lim_{N\to\infty} N^{-1} \ln Z(\alpha,\beta) = \frac{3}{2} \ln \frac{2\pi m}{h^2 \beta} - \ln\alpha. \tag{98} $$

The components of the Fisher-Rao metric, with respect to the parameterisation $(\alpha,\beta)$,
can then be calculated by differentiation, with the result

$$ G_{ij} = \begin{pmatrix} \alpha^{-2} & 0 \\ 0 & \tfrac{3}{2}\beta^{-2} \end{pmatrix}. \tag{99} $$

From this expression we deduce the following. 

Proposition 4 All components of the Riemann tensor, and consequently also the scalar
curvature, of the statistical manifold $\mathfrak{M}$ associated with the classical ideal gas vanish,
and thus the manifold is flat.

Proof. From the components of the metric (99) one can calculate the components of
the Christoffel symbol $\Gamma^i_{jk}$ and the Riemann tensor $R^i_{jkl}$ using the definitions
given earlier. Alternatively, to show that this manifold is flat, it suffices to display a change of
coordinates which transforms the metric (99) into a Euclidean metric. Here, we adopt
the latter approach because this also permits more expeditious solution of the geodesic
equations. We recall that under a coordinate transformation $x^i \to x'^i$ the metric of a
Riemannian manifold transforms in the usual tensorial manner, so that the components
of the inverse metric in the new coordinate system are determined by the contraction

$$ G'^{ij} = G^{kl}\, \frac{\partial x'^i}{\partial x^k} \frac{\partial x'^j}{\partial x^l}. \tag{100} $$

Consider the following coordinate transformation

$$ \alpha \to \alpha' = \ln\alpha \quad \text{and} \quad \beta \to \beta' = \sqrt{\tfrac{3}{2}}\, \ln\beta. \tag{101} $$

A straightforward calculation then shows that the components of the inverse metric in
the $(\alpha', \beta')$ coordinate system are

$$ G'^{ij} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \tag{102} $$

and thus the manifold is indeed flat. □
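The two steps used above, obtaining the metric (99) as the Hessian of the potential (98) and flattening it with the coordinate change (101), can be checked numerically. The sketch below is only illustrative: the function names are hypothetical, the constant $2\pi m/h^2$ is set to 1 as an assumed choice of units, and the Hessian is estimated by central finite differences rather than computed exactly.

```python
import math

def psi_ideal(alpha, beta, c=1.0):
    """Potential (98); c stands for 2*pi*m/h^2 (assumed equal to 1 here)."""
    return 1.5 * math.log(c / beta) - math.log(alpha)

def hessian(f, x, y, rel_step=1e-3):
    """Central finite-difference Hessian of a function f(x, y)."""
    hx, hy = rel_step * x, rel_step * y
    fxx = (f(x + hx, y) - 2.0 * f(x, y) + f(x - hx, y)) / hx**2
    fyy = (f(x, y + hy) - 2.0 * f(x, y) + f(x, y - hy)) / hy**2
    fxy = (f(x + hx, y + hy) - f(x + hx, y - hy)
           - f(x - hx, y + hy) + f(x - hx, y - hy)) / (4.0 * hx * hy)
    return [[fxx, fxy], [fxy, fyy]]

def transformed_inverse_metric(alpha, beta):
    """Apply the contraction (100) with the coordinate change (101)."""
    # Inverse of the metric (99).
    g_inv = [[alpha**2, 0.0], [0.0, (2.0 / 3.0) * beta**2]]
    # Jacobian d x'^i / d x^k for alpha' = ln(alpha), beta' = sqrt(3/2) ln(beta).
    jac = [[1.0 / alpha, 0.0], [0.0, math.sqrt(1.5) / beta]]
    return [[sum(jac[i][k] * g_inv[k][l] * jac[j][l]
                 for k in range(2) for l in range(2))
             for j in range(2)] for i in range(2)]
```

At any point with $\alpha, \beta > 0$, `hessian(psi_ideal, alpha, beta)` should approximate the diagonal matrix in (99), and `transformed_inverse_metric(alpha, beta)` should return the identity matrix up to rounding.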
Since the statistical manifold associated with the ideal gas is flat, solution of the
geodesic equations is straightforward. The result can be summarised as follows.

Proposition 5 The geodesic curves on the statistical manifold $\mathfrak{M}$ associated with the
classical ideal gas are given by

$$ \frac{P}{P_0} = \left( \frac{T}{T_0} \right)^{1-c}, \tag{103} $$

where $P_0$, $T_0$, and $c$ are integration constants. In particular, the geodesics include the
adiabatic equation of state for the ideal gas, corresponding to the choice $c = -C_V/Nk_B$,
where $C_V$ is the constant-volume heat capacity.

Proof. The geodesic equations for the variables $\alpha$ and $\beta$ assume identical forms, i.e.

$$ \frac{d^2 x}{ds^2} - \frac{1}{x} \left( \frac{dx}{ds} \right)^2 = 0 \tag{104} $$

for $x = \alpha, \beta$. This can be rewritten as

$$ \frac{d^2}{ds^2} \ln x = 0, \tag{105} $$

from which we see that the general solution is $x(s) = c_1 e^{c_2 s}$. Thus, we obtain

$$ \frac{P}{k_B T} = c_1 e^{c_2 s}, \qquad \frac{1}{k_B T} = c_3 e^{c_4 s} \tag{106} $$

as the general solution to the geodesic equations. Combining these two equations, we
have

$$ P = \frac{c_1}{c_3^{\,c_2/c_4}}\, (k_B T)^{1 - c_2/c_4}. \tag{107} $$

Setting $s = 0$ we find $c_1 = P_0/k_B T_0$ and $c_3 = 1/k_B T_0$, which, on writing $c = c_2/c_4$,
yields at once the expression in (103). □
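The geodesic solution can be illustrated with a short numerical sketch. It evaluates the exponential solutions (106), converts them to pressure and temperature, and provides a finite-difference residual for the geodesic equation (104). The function names and the choice $k_B = 1$ are assumptions made for illustration; the value $c = -3/2$ used in the usage note corresponds to a monatomic ideal gas, for which $C_V = \tfrac{3}{2} N k_B$.

```python
import math

def geodesic_state(s, p0, t0, c2, c4, k_b=1.0):
    """Pressure and temperature along the geodesic (106) at parameter s."""
    alpha = (p0 / (k_b * t0)) * math.exp(c2 * s)   # alpha = P / (k_B T)
    beta = (1.0 / (k_b * t0)) * math.exp(c4 * s)   # beta = 1 / (k_B T)
    temperature = 1.0 / (k_b * beta)
    pressure = alpha * k_b * temperature
    return pressure, temperature

def geodesic_residual(x, s, h=1e-3):
    """Finite-difference estimate of x'' - (x')^2 / x, the left side of (104)."""
    first = (x(s + h) - x(s - h)) / (2.0 * h)
    second = (x(s + h) - 2.0 * x(s) + x(s - h)) / h**2
    return second - first**2 / x(s)
```

With `c2 = -1.5` and `c4 = 1.0`, so that $c = c_2/c_4 = -3/2$, the returned pressure and temperature should satisfy $P/P_0 = (T/T_0)^{5/2}$, which is the adiabatic relation for a monatomic ideal gas.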
3. Van der Waals gas

The geometry of the statistical manifold changes considerably if the gas molecules 
interact. In particular, if the system exhibits a phase transition, then the curvature 
tends to become singular at the transition point. This property seems to be universal and 
appears in many systems exhibiting critical phenomena. The van der Waals gas model 
is not only of physical interest, but also illustrates many of the universal geometrical 
features of the associated manifold of equilibrium states. 



3.1. Equation of state 

The idealised system of noninteracting particles considered above is inadequate for the 
description of phase transition phenomena, that is, the condensation of gas molecules. 
Here we shall extend the model to include inter-particle interactions, which leads to the 
van der Waals equation of state: 

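$$ \left( P + a\,\frac{N^2}{V^2} \right) (V - bN) = N k_B T, \tag{108} $$

where $N$ is the total number of molecules and $a$, $b$ are constants determined by the
properties of the molecule. The liquid-vapour transition occurs at the critical point
where the temperature $T$, pressure $P$, and volume $V$ simultaneously assume the values

$$ P_c = \frac{a}{27 b^2}, \quad V_c = 3bN, \quad \text{and} \quad T_c = \frac{8a}{27 k_B b}. \tag{109} $$

The critical point is determined by the simultaneous solution of the equations

$$ \frac{\partial P}{\partial V} = 0 \quad \text{and} \quad \frac{\partial^2 P}{\partial V^2} = 0. \tag{110} $$

Using the dimensionless variables

$$ \tilde P = \frac{P}{P_c}, \quad \tilde V = \frac{V}{V_c}, \quad \tilde T = \frac{T}{T_c}, \tag{111} $$

the equation of state can be rewritten in the universal form

$$ \left( \tilde P + \frac{3}{\tilde V^2} \right) \left( \tilde V - \frac{1}{3} \right) = \frac{8}{3}\, \tilde T, \tag{112} $$

independent of the parameters $a$ and $b$. In Figure 3 we plot the pressure $\tilde P$ as a function
of the volume $\tilde V$ for $\tilde T > 1$, $\tilde T = 1$, and $\tilde T < 1$.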

Note that the positivity of the pressure implies a bound on the temperature. In
particular, from (112) we deduce that the condition $\tilde P > 0$ is equivalent to the bound

$$ \tilde T > \frac{3(3\tilde V - 1)}{8 \tilde V^2}, \tag{113} $$

in terms of the dimensionless volume $\tilde V$. The right side of (113) attains its maximum
value $\frac{27}{32}$ at $\tilde V = \frac{2}{3}$. Hence, if we demand the positivity of $\tilde P$ for all volumes
$\tilde V > \frac{1}{3}$, then we must require $\tilde T > \frac{27}{32}$, or equivalently

$$ T > \frac{27}{32}\, T_c. \tag{114} $$

Figure 3. Equations of state for the van der Waals gas in terms of dimensionless
variables. The isothermal curves correspond to $\tilde T = 1.4$, $\tilde T = 1$, $\tilde T = 27/32$ (maximum
superheating temperature) and $\tilde T \approx 0.6$. Note that the isothermal curves associated
with temperatures below $T_m$ allow metastable regions for which $P < 0$.

The temperature

$$ T_m = \frac{27}{32}\, T_c \approx 0.84\, T_c \tag{115} $$

is known as the temperature of maximum superheating and is related to the nucleation
of bubbles when the liquid is heated very abruptly. In particular, if $T < T_m$ then the
liquid can be contained at low external pressure, whereas if $T > T_m$ the liquid cannot
exist under low external pressure and thus evaporates. Experimental data show, for
example, that $T_m = 0.89\,T_c$ for ether, $T_m = 0.92\,T_c$ for alcohol, and $T_m = 0.84\,T_c$ for
water, indicating the fairly accurate predictive power of the van der Waals equations of
state.
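The bound (113) and the value $27/32$ in (114) and (115) can be reproduced with a few lines of code. The sketch below uses the reduced variables of (112) and locates the largest value of the bound by a simple grid scan; the function names, the scan range, and the number of samples are illustrative assumptions.

```python
def reduced_pressure(v, t):
    """Reduced van der Waals pressure obtained by solving (112) for the pressure."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2

def positivity_bound(v):
    """Right side of (113): temperatures above this value give positive pressure."""
    return 3.0 * (3.0 * v - 1.0) / (8.0 * v**2)

def largest_bound(v_max=5.0, samples=20001):
    """Scan reduced volumes in (1/3, v_max] and return (bound, volume) at the maximum."""
    best_bound, best_v = float("-inf"), None
    for k in range(1, samples):
        v = 1.0 / 3.0 + k * (v_max - 1.0 / 3.0) / (samples - 1)
        bound = positivity_bound(v)
        if bound > best_bound:
            best_bound, best_v = bound, v
    return best_bound, best_v
```

The scan should return a bound close to $27/32 \approx 0.84$ at a reduced volume close to $2/3$, and `reduced_pressure(2/3, 27/32)` should vanish up to rounding, which is the point where the isotherm at the maximum superheating temperature touches zero pressure.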

Turning to the equations of state, for a given pressure $P$ the equation of state, regarded
as an equation for the volume $V$, has three distinct roots when the temperature is below
its critical value $T_c$. Amongst these three roots, the intermediate root corresponds to a
point at which $(\partial P/\partial V)_T > 0$. Hence, this root is unstable, since the pressure increases
with volume for fixed temperature. It follows that one of the remaining two roots should
correspond to thermal equilibrium. To ascertain which of the two roots is stable, we recall
that the condition for stability is determined by the minimisation of the free energy. If we
let $G(T, P)$ denote the Gibbs free energy, then Maxwell's relation $V = (\partial G/\partial P)_T$ implies that

$$ G(T,P) = G(T,P_0) + \int_{P_0}^{P} V(u,T)\, du. \tag{116} $$

Therefore, when viewed as a function of pressure $P$ for a fixed temperature $T < T_c$
below the critical point, the free energy $G(T, P)$ describes one of two distinct curves,
depending on whether $P$ is reduced from high values or increased from low values. This
is shown schematically in Figure 4.

Figure 4. Isothermal curve and equal area law in the pressure-volume plane (above),
and pressure dependence of the Gibbs free energy (below). As the pressure $P$ of the gas
is slowly increased, the Gibbs free energy $G(P)$ increases along the thick solid line in
the lower diagram, until $P$ reaches the coexisting pressure $P_2$. The gas then undergoes
a phase transition and condenses. During this transition the volume changes from
$V_g$ to $V_l$, at which point the entire system enters the liquid phase. The value of the
coexisting pressure $P_2$ is determined by Maxwell's equal area law.

Figure 5. Lennard-Jones potential. There is a strong repulsive force at short distances
$r < r_0$ and a weak attractive force at long distances $r > r_0$, where $r_0 = 2^{1/6} d$ and $d$
represents the radius of the gas molecule.

If the free energy assumes its minimum value, then as the value of $P$ changes,
$G(T,P)$ must describe one of the two curves in Figure 4 until its intersection with the
other curve at pressure $P = P_2$, whereafter $G(T, P)$ follows the other curve. At the point
$P = P_2$ the liquid and vapour phases coexist. Therefore, if we, say, reduce the pressure
from high values, then after reaching the value $P_2$ the pressure remains constant until the
liquid is entirely converted into vapour. During this transition the volume changes from
$V_l$ to $V_g$ as indicated in Figure 4. The value of the coexisting pressure $P_2$ is determined,
for each fixed $T < T_c$, by Maxwell's equal area principle. That is, the vertical line in
Figure 4 is chosen so that the areas of the two shaded regions are exactly equal.
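Maxwell's equal area principle can be turned into a small numerical procedure. The sketch below works in the reduced variables of (112): for a trial pressure it finds the liquid and gas roots of the isotherm by bisection, evaluates the area between the isotherm and the horizontal line using the elementary antiderivative of the reduced pressure, and then bisects on the pressure until the two areas balance. All function names, brackets, and iteration counts are assumptions made for illustration, and the routine is intended only for reduced temperatures moderately below the critical value (say between 0.5 and 1).

```python
import math

def pressure(v, t):
    """Reduced van der Waals isotherm, obtained from (112)."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2

def bisect(f, lo, hi, iterations=200):
    """Root of f in [lo, hi], assuming f changes sign exactly once there."""
    lo_positive = f(lo) > 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0.0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def coexistence(t):
    """Return (P2, V_liquid, V_gas) in reduced units for a reduced temperature t < 1."""
    if not 0.0 < t < 1.0:
        raise ValueError("coexistence requires 0 < t < 1 in reduced units")
    # dP/dV = 0 is equivalent to slope(v) = 0; the two roots bound the unstable branch.
    slope = lambda v: 4.0 * t * v**3 - (3.0 * v - 1.0)**2
    v_min = bisect(slope, 1.0 / 3.0 + 1e-12, 1.0)   # local minimum of the isotherm
    v_max = bisect(slope, 1.0, 1.0e3)               # local maximum of the isotherm

    def roots(p):
        v_liq = bisect(lambda v: pressure(v, t) - p, 1.0 / 3.0 + 1e-12, v_min)
        v_gas = bisect(lambda v: pressure(v, t) - p, v_max,
                       (8.0 * t / p + 1.0) / 3.0 + 1.0)
        return v_liq, v_gas

    def area_mismatch(p):
        # Integral of the isotherm between the outer roots minus the rectangle area.
        v_liq, v_gas = roots(p)
        integral = ((8.0 * t / 3.0)
                    * math.log((3.0 * v_gas - 1.0) / (3.0 * v_liq - 1.0))
                    + 3.0 / v_gas - 3.0 / v_liq)
        return integral - p * (v_gas - v_liq)

    p_low = max(pressure(v_min, t), 1e-9)
    p_high = pressure(v_max, t)
    p2 = bisect(area_mismatch, p_low, p_high, iterations=60)
    return (p2,) + roots(p2)
```

Calling `coexistence(0.9)` returns the reduced coexistence pressure together with the liquid and gas volumes at that temperature; both volumes should lie on the isotherm at the returned pressure, with the liquid volume below and the gas volume above the critical volume.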



3.2. Canonical partition function 



The equation of state (108) was first deduced empirically by van der Waals, directly
from experimental observations. However, it can also be derived analytically from the
canonical partition function associated with an empirically postulated intermolecular
potential. Assume that the interaction energy between a pair of molecules separated by
a distance $r$ is given by the Lennard-Jones potential

$$ \phi(r) = 4\phi_0 \left[ \left( \frac{d}{r} \right)^{12} - \left( \frac{d}{r} \right)^{6} \right], \tag{117} $$

where $r_0 = 2^{1/6} d$ and $d$ is a parameter which can be regarded as the radius of the gas
molecule. Clearly, $\phi(d) = 0$ and $\phi(r)$ assumes its minimum value at $r = r_0$. As we
see from Figure 5, this inter-molecular potential energy gives rise to a weak long-range
attractive force and a strong short-range repulsive force between each pair of molecules.
The canonical partition function can thus be written as

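$$ z(\beta,V) = \frac{1}{N!\,h^{3N}} \prod_{i=1}^{N} \int d^3p_i \int d^3r_i\, \exp\left( -\beta \sum_{i=1}^{N} \frac{p_i^2}{2m} - \beta \sum_{i<j} \phi_{ij} \right), \tag{118} $$

where $\phi_{ij} = \phi(r_{ij})$, with $r_{ij}$ denoting the distance between the $i$th and $j$th molecules.
Thus, the canonical partition function can be expressed as a product

$$ z(\beta,V) = z_0(\beta,V)\, Q(\beta,V), \tag{119} $$

where $z_0(\beta, V)$ is the canonical partition function for the ideal gas (95) and

$$ Q(\beta,V) = \frac{1}{V^N} \int d^3r_1 \cdots \int d^3r_N\, \exp\left( -\beta \sum_{i<j} \phi_{ij} \right) \tag{120} $$

is the contribution from the interaction energy.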

Now, as an approximation to the Lennard-Jones potential we assume that
$\exp(-\beta\phi_{ij}) = 0$ for $r_{ij} < d$. In other words, we regard the molecules as hard spheres
of radius $d$, which cannot overlap. As a consequence, the overlapping region can be
removed from the range of the volume integration (120). Defining the so-called Mayer
function $f_{ij} = f(r_{ij})$ by

$$ f_{ij} = \exp(-\beta\phi_{ij}) - 1, \tag{121} $$

we rewrite the integral (120) as

$$ Q(\beta,V) = \frac{1}{V^N} \int_{r_1>d} \cdots \int_{r_N>d} d^3r_1 \cdots d^3r_N \prod_{i<j} (1 + f_{ij})
 = \frac{1}{V^N} \int_{r_1>d} \cdots \int_{r_N>d} d^3r_1 \cdots d^3r_N \left[ 1 + \sum f_{ij} + \sum\sum f_{ij} f_{kl} + \cdots \right]. \tag{122} $$

Assuming that the parameter $\phi_0$ in (117) is sufficiently small, the contribution arising
from $f_{ij}$ in the specified integration range can be regarded as an infinitesimal. The first
term on the right side of (122), i.e. the integral of unity, can, on the other hand, be
approximated by

$$ V (V - v_0) \cdots [V - (N-2)v_0]\, [V - (N-1)v_0] \approx (V - bN)^N, \tag{123} $$

where we put $v_0 = 2b = \frac{4}{3}\pi d^3$. The integrations are performed consecutively, so that the
first particle can occupy volume $V$ without constraints, the second particle can occupy
volume $V$ less the volume $v_0$ occupied by the first particle, the third particle can occupy
volume $V$ less the volume $2v_0$ occupied by the first two particles, and so on. Similarly,
each of the $N(N-1)/2$ pair contributions to the second term on the right side of (122)
can be approximated, using $f(r) \approx -\beta\phi(r)$ for small $\phi_0$, by

$$ \int_{r_1>d} \cdots \int_{r_N>d} d^3r_1 \cdots d^3r_N\, f_{ij} \approx (V - bN)^{N-1} \int_{r_j>d} d^3r_j\, f_{ij} \approx -4\pi\beta\, (V - bN)^{N-1} \int_d^\infty \phi(r)\, r^2\, dr. \tag{124} $$

Assembling these results, we can approximate $Q(\beta, V)$ in the following closed form

$$ Q(\beta,V) \approx \left( 1 - \frac{bN}{V} \right)^{N} \left[ 1 + \beta\, \frac{aN^2}{V} + \cdots \right] \approx \left( 1 - \frac{bN}{V} \right)^{N} \exp\left( \beta\, \frac{aN^2}{V} \right), \tag{125} $$

where we have defined

$$ a = -2\pi \int_d^\infty r^2 \phi(r)\, dr. \tag{126} $$

Using the above expression for $Q(\beta, V)$ we finally obtain the canonical partition function

$$ z(\beta,V) = \frac{1}{N!} \left( \frac{2\pi m}{\beta h^2} \right)^{3N/2} (V - bN)^N \exp(a\beta N^2/V). \tag{127} $$



3.3. Critical behaviour of the van der Waals gas

From the expression for the partition function of the canonical distribution we deduce
the equation of state

$$ P = \frac{1}{\beta} \left( \frac{\partial \ln z(\beta,V)}{\partial V} \right)_\beta = \frac{N k_B T}{V - bN} - \frac{aN^2}{V^2}, \tag{128} $$

where for clarity we have substituted $\beta = 1/k_B T$. Observe that this is precisely the van
der Waals equation introduced in (108). If we had not applied various approximations
in the derivation of (127), then additional terms of higher order in $N/V$ would have
appeared on the right side of (128).
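The passage from the partition function (127) to the equation of state (128) can be checked numerically by differentiating $\ln z$ with respect to the volume. The following sketch does this with a central difference; the function names are hypothetical, $k_B T = 1/\beta$ is used throughout, and the constant $2\pi m/h^2$ is set to 1 as an assumed choice of units (it does not affect the volume derivative).

```python
import math

def log_z(beta, volume, n, a, b, c=1.0):
    """ln z from (127); c stands for 2*pi*m/h^2 (assumed equal to 1 here)."""
    return (-math.lgamma(n + 1) + 1.5 * n * math.log(c / beta)
            + n * math.log(volume - b * n) + a * beta * n**2 / volume)

def pressure_from_log_z(beta, volume, n, a, b, rel_step=1e-5):
    """P = (1/beta) d(ln z)/dV, estimated by a central finite difference."""
    h = rel_step * volume
    slope = (log_z(beta, volume + h, n, a, b)
             - log_z(beta, volume - h, n, a, b)) / (2.0 * h)
    return slope / beta

def van_der_waals_pressure(beta, volume, n, a, b):
    """Right side of (128), with k_B T = 1/beta."""
    return n / (beta * (volume - b * n)) - a * n**2 / volume**2
```

For any admissible state with $V > bN$, the two pressure functions should agree up to the finite-difference error.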

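To analyse the behaviour of the system near a critical point we introduce the
deviation parameters

$$ p = \tilde P - 1, \quad v = \tilde V - 1, \quad \text{and} \quad t = \tilde T - 1. \tag{129} $$

In terms of these shifted variables the equation of state (128) becomes

$$ p = \frac{8(t+1)}{3v+2} - \frac{3}{(v+1)^2} - 1. \tag{130} $$

We can then expand the equation of state (130) for small $v$ and $t$, obtaining

$$ p = t(4 - 6v) - \frac{3}{2} v^3 + \cdots. \tag{131} $$

Similarly the Gibbs free energy

$$ G = PV - k_B T \ln z(T,V) \tag{132} $$

can be expanded as follows:

$$ G = P_c V_c \left[ (p - 4t)v + 3tv^2 + \frac{3}{8} v^4 + 1 + p \right] + G_0(T), \tag{133} $$

where $G_0(T)$ collects the remaining terms, including a term logarithmic in the
temperature and proportional to $N k_B T$, all of which are independent of $v$.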

For fixed pressure $p$ and temperature $t$, the volume $v$ in thermal equilibrium is that
which minimises the Gibbs free energy $G$. The equation of state (131) is a necessary
but not sufficient condition for the Gibbs free energy to assume its minimum. Therefore,
at the coexisting pressure $p = 4t$ below the critical temperature ($t < 0$) we have the
three roots

$$ v = \pm 2\sqrt{-t}, \quad 0 \tag{134} $$

for the volume determined by the equation of state (131). Differentiating (133) with
respect to $v$, we find that the first two roots $\pm 2\sqrt{-t}$ minimise the free energy $G$ and
thus correspond to stable states, while the root $v = 0$ maximises $G$ and thus corresponds
to an unstable state. Stable states represent the coexisting phase of liquid and vapour,
with pressure given by

$$ P_2 = P_c (1 + 4t) = P_c \left( 1 - 4\, \frac{T_c - T}{T_c} \right). \tag{135} $$

The liquid phase is more stable when $P_2 < P < P_c$, and the vapour phase more stable
when $P < P_2$.
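The expansion about the critical point can be compared with the exact shifted equation of state numerically. The sketch below implements (130), its leading-order expansion (131), and the three roots (134) at the coexisting pressure $p = 4t$. The function names are illustrative assumptions; the neglected terms in the expansion are of order $tv^2$ and $v^4$, so agreement is expected only for small $v$ and $t$.

```python
import math

def shifted_pressure(v, t):
    """Exact shifted equation of state (130)."""
    return 8.0 * (t + 1.0) / (3.0 * v + 2.0) - 3.0 / (v + 1.0)**2 - 1.0

def shifted_pressure_expanded(v, t):
    """Leading terms (131) of the expansion about the critical point."""
    return t * (4.0 - 6.0 * v) - 1.5 * v**3

def coexistence_roots(t):
    """Roots (134) of the expanded equation of state at p = 4t, for t < 0."""
    if t >= 0.0:
        raise ValueError("coexistence requires t < 0")
    r = 2.0 * math.sqrt(-t)
    return (-r, 0.0, r)
```

Each value returned by `coexistence_roots(t)` should satisfy `shifted_pressure_expanded(v, t) == 4 * t` up to rounding, and the second derivative $6t + \tfrac{9}{2}v^2$ of the bracket in the free energy expansion is negative at $v = 0$ and positive at the outer roots, in line with the stability statement above.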



3.4. The thermodynamic limit

The existence of the instability in the van der Waals system studied above is related 
to the fact that in the canonical distribution the volume V of the system is held fixed, 
whereas in a real gas volume fluctuations are significant in the vicinity of the critical 
point. In other words, the canonical distribution does not provide a completely accurate 
physical description of the vapour-liquid equilibrium. Therefore, as in the case of an 
ideal gas, we consider the pressure-temperature distribution, with the corresponding 
partition function 

$$ Z(\alpha,\beta) = \int_{bN}^{\infty} z(\beta,V) \exp(-\alpha V)\, dV \tag{136} $$

wherein the volume fluctuation is integrated out. Recall that $b = \frac{2}{3}\pi d^3$ represents the
smallest volume each molecule can occupy. Hence, the random variable $V$ representing
the total volume ranges from $bN$ to infinity. When the canonical partition function
(127) is substituted into (136), the resulting integral does not admit an elementary
analytical expression. Nevertheless, in the thermodynamic limit $N \to \infty$ we can
implicitly determine the potential $\psi(\alpha,\beta) = N^{-1} \ln Z(\alpha,\beta)$ by the method of steepest
descent.

We proceed as follows. First we write the integrand in (136) as

$$ \exp(-\alpha V)\, z(\beta,V) = \exp[N g(\alpha,\beta,v)], \tag{137} $$

where $v = V/N$ and

$$ g(\alpha,\beta,v) = 1 - \alpha v + \ln\left[ \left( \frac{2\pi m}{\beta h^2} \right)^{3/2} (v - b) \right] + \frac{a\beta}{v}. \tag{138} $$

In deriving (138) we have used the Stirling formula $\ln N! \approx N \ln N - N$. It should be
evident from (132) that

$$ G = -\beta^{-1} g(\alpha,\beta,v) \tag{139} $$

is the Gibbs free energy per particle. Also, note that $g(\alpha,\beta,v)$ must have at least one maximum
in the range $v \in [b, \infty)$ corresponding to the minimum Gibbs free energy, because
$g(\alpha,\beta,v) \to -\infty$ in the limits $v \to b$ and $v \to \infty$. The value of $v$ at which
$g(\alpha,\beta,v)$ is maximised therefore determines the equation of state (128) for the canonical
distribution. However, in the P-T distribution the volume is a random variable, hence
we must take its expectation to obtain the equation of state:

$$ \langle v \rangle = -\frac{\partial \psi(\alpha,\beta)}{\partial \alpha}, \tag{140} $$

where $\langle v \rangle$ denotes the expected volume per particle in the P-T distribution characterised
by the density function $Z^{-1}(\alpha,\beta) \exp[N g(\alpha,\beta,v)]$.

Applying the change of variable $V \to v$, the partition function (136) can be written
in the form

$$ Z(\alpha,\beta) = N \int_b^{\infty} \exp[N g(\alpha,\beta,v)]\, dv. \tag{141} $$

Recall that we are interested in the thermodynamic potential per particle in the
thermodynamic limit:

$$ \psi(\alpha,\beta) = \lim_{N\to\infty} N^{-1} \ln Z(\alpha,\beta). \tag{142} $$

Using the method of steepest descent, we find that $\psi(\alpha,\beta)$ in this limit is given by

$$ \psi(\alpha,\beta) = g(\alpha,\beta,\bar v) = -\alpha \bar v + \ln z(\beta,\bar v), \tag{143} $$

where $\bar v = \bar v(\alpha,\beta)$ is the function of $\alpha$ and $\beta$ which maximises $g(\alpha,\beta,v)$, and where
$\ln z(\beta, \bar v)$ is understood per particle. Since $\bar v$
minimises the Gibbs free energy, it is the solution of the van der Waals equation of
state. Although the exact functional form of $\bar v(\alpha,\beta)$ is not at our disposal owing to the
cubic nature of the equation of state, we can nonetheless determine the exact expression
for the scalar curvature in terms of the variables $\beta$ and $\bar v$. Before we proceed, however,
we first establish the following result.

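Proposition 6 The thermal expectation value of the volume $v$ per particle in the P-T
distribution is given by $\bar v$, that is, $\langle v \rangle = \bar v$.

Proof. Differentiating (143) and using the chain rule we find

$$ \frac{\partial \psi}{\partial \alpha} = -\bar v + \left( -\alpha + \frac{\partial \ln z}{\partial \bar v} \right) \frac{\partial \bar v}{\partial \alpha}. \tag{144} $$

However, by definition $\bar v$ maximises $g(\alpha,\beta,v)$ so that

$$ \frac{\partial g}{\partial \bar v} = -\alpha + \frac{\partial \ln z}{\partial \bar v} = 0, \tag{145} $$

and hence

$$ \frac{\partial \psi}{\partial \alpha} = -\bar v. \tag{146} $$

On the other hand, from (140) we have $\langle v \rangle = -\partial\psi/\partial\alpha$, and thus $\langle v \rangle = \bar v$. □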
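Proposition 6 lends itself to a direct numerical check. The sketch below evaluates the steepest-descent potential $\psi = g(\alpha, \beta, \bar v)$ of (143) by locating the maximiser $\bar v$ of (138) with bisection, and then compares $-\partial\psi/\partial\alpha$, estimated by a central difference, with $\bar v$ itself. The names, the values $a = b = 1$, and the unit choice $2\pi m/h^2 = 1$ are assumptions made for illustration. The bisection relies on $g$ having a single maximum, which holds above the critical temperature, that is, for $\beta < 27b/(8a)$.

```python
import math

A, B = 1.0, 1.0  # assumed van der Waals constants; then beta_c = 27/8

def g(alpha, beta, v, a=A, b=B):
    """Exponent (138), with 2*pi*m/h^2 set to 1 (an assumed unit choice)."""
    return 1.0 - alpha * v + math.log(beta**-1.5 * (v - b)) + a * beta / v

def v_bar(alpha, beta, a=A, b=B):
    """Maximiser of g for beta below beta_c, found by bisection on dg/dv."""
    dg = lambda v: -alpha + 1.0 / (v - b) - a * beta / v**2
    lo, hi = b, b + 2.0 / alpha   # dg > 0 just above b and dg < 0 at hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dg(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def psi(alpha, beta):
    """Steepest-descent potential (143)."""
    return g(alpha, beta, v_bar(alpha, beta))

def mean_volume(alpha, beta, rel_step=1e-5):
    """Estimate of <v> = -d(psi)/d(alpha), as in (140), by a central difference."""
    h = rel_step * alpha
    return -(psi(alpha + h, beta) - psi(alpha - h, beta)) / (2.0 * h)
```

For a supercritical state such as `alpha = 0.1`, `beta = 2.0`, the values of `mean_volume(alpha, beta)` and `v_bar(alpha, beta)` should coincide to within the finite-difference error.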
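3.5. Geometry of the van der Waals manifold

As we have just indicated, the functional form of $\bar v(\alpha,\beta)$ is unknown. Nevertheless,
we can implicitly determine expressions for $\partial \bar v/\partial\alpha$, $\partial \bar v/\partial\beta$, and so on, in the following
manner. First, define $\Omega$ by

$$ \Omega = \frac{\partial g}{\partial v} = -\alpha + \frac{1}{v - b} - \frac{a\beta}{v^2}. \tag{147} $$

Since $v = \bar v$ maximises $g(\alpha,\beta,v)$ we have by definition the relation $\Omega = 0$. This, however,
is just the equation of state for the van der Waals gas. Now, consider the total derivative

$$ d\Omega = \frac{\partial\Omega}{\partial\alpha}\, d\alpha + \frac{\partial\Omega}{\partial\beta}\, d\beta + \frac{\partial\Omega}{\partial v}\, dv. \tag{148} $$

Since $d\Omega = 0$ it follows that

$$ dv = -\left( \frac{\partial\Omega}{\partial\alpha} \Big/ \frac{\partial\Omega}{\partial v} \right) d\alpha - \left( \frac{\partial\Omega}{\partial\beta} \Big/ \frac{\partial\Omega}{\partial v} \right) d\beta = \left( \frac{\partial v}{\partial\alpha} \right)_\beta d\alpha + \left( \frac{\partial v}{\partial\beta} \right)_\alpha d\beta, \tag{149} $$

where we have used the general identity:

$$ \left( \frac{\partial\alpha}{\partial\beta} \right)_\gamma \left( \frac{\partial\beta}{\partial\gamma} \right)_\alpha \left( \frac{\partial\gamma}{\partial\alpha} \right)_\beta = -1. \tag{150} $$

On the other hand, from (147) we have the relations

$$ \frac{\partial\Omega}{\partial\alpha} = -1, \quad \frac{\partial\Omega}{\partial\beta} = -\frac{a}{v^2}, \quad \frac{\partial\Omega}{\partial v} = \frac{2a\beta}{v^3} - \frac{1}{(v-b)^2}. \tag{151} $$

Therefore, substituting these into (149) we deduce that

$$ \frac{\partial v}{\partial\alpha} = \frac{1}{D}, \quad \frac{\partial v}{\partial\beta} = \frac{1}{D}\, \frac{a}{v^2}, \tag{152} $$

where $D$ is defined by

$$ D = \frac{2a\beta}{v^3} - \frac{1}{(v-b)^2}. \tag{153} $$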

The derivatives of $\bar v$ with respect to the parameters $\alpha$ and $\beta$ are required in order
to determine the components of the Fisher-Rao metric on the van der Waals manifold.
Specifically we obtain the following result:

Proposition 7 In terms of the pressure-temperature coordinates $(\alpha,\beta)$ the Fisher-Rao
metric on the van der Waals manifold is given by

$$ G_{ij} = \frac{1}{D} \begin{pmatrix} -1 & -a/v^2 \\ -a/v^2 & \tfrac{3}{2}\beta^{-2} D - (a/v^2)^2 \end{pmatrix}. \tag{154} $$

In particular, in the ideal gas limit $a \to 0$ and $b \to 0$, the metric (154) reduces to the
metric (99) for the ideal gas.

Figure 6. Schematic illustration of equations of state for van der Waals gas. The
scalar curvature on the parameter space diverges along the spinodal boundary which 
envelopes the unphysical region. The critical point is that where the spinodal curve is 
tangent to the Maxwell equal area boundary. The divergence of the curvature along 
the spinodal boundary may be interpreted as 'preventing', in some sense, entry into 
the unphysical domain in the phase diagram. 



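Proof. The components of the metric are determined by the matrix $\partial_i \partial_j \psi(\alpha,\beta)$.
We have, in Proposition 6, established that $\partial\psi/\partial\alpha = -\bar v$, and, using (145), we have

$$ \frac{\partial\psi}{\partial\beta} = -\alpha\, \frac{\partial \bar v}{\partial\beta} + \frac{\partial \ln z}{\partial \bar v}\, \frac{\partial \bar v}{\partial\beta} + \frac{\partial \ln z}{\partial\beta} = \frac{\partial \ln z}{\partial\beta}. \tag{155} $$

Therefore, we obtain

$$ \frac{\partial^2\psi}{\partial\alpha^2} = -\frac{\partial \bar v}{\partial\alpha}, \quad \frac{\partial^2\psi}{\partial\alpha\,\partial\beta} = -\frac{\partial \bar v}{\partial\beta}, \quad \frac{\partial^2\psi}{\partial\beta^2} = \frac{\partial^2 \ln z}{\partial\beta^2} + \frac{\partial^2 \ln z}{\partial\beta\,\partial \bar v}\, \frac{\partial \bar v}{\partial\beta}, \tag{156} $$

whence the desired expression for the metric follows from the formula (127) for the
canonical partition function. In the ideal gas limit $a \to 0$ and $b \to 0$, we have $D \to -\bar v^{-2}$.
However, from the ideal gas equation of state we have $\bar v = \alpha^{-1}$, hence we recover (99)
at once in this limit. □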
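The closed-form metric (154) can be compared with a direct numerical Hessian of the potential $\psi(\alpha,\beta)$. The sketch below does so for a supercritical state, where $g$ has a single maximum and bisection on $\partial g/\partial v$ is safe. The function names, the values of $a$ and $b$ in the usage note, and the unit choice $2\pi m/h^2 = 1$ are assumptions made for illustration.

```python
import math

def v_bar(alpha, beta, a, b):
    """Solve the equation of state Omega = 0 of (147) by bisection (single-root case)."""
    dg = lambda v: -alpha + 1.0 / (v - b) - a * beta / v**2
    lo, hi = b, b + 2.0 / alpha
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dg(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def psi(alpha, beta, a, b):
    """Potential (143), using (138) with 2*pi*m/h^2 set to 1 (assumed units)."""
    v = v_bar(alpha, beta, a, b)
    return 1.0 - alpha * v + math.log(beta**-1.5 * (v - b)) + a * beta / v

def metric_formula(alpha, beta, a, b):
    """Fisher-Rao metric (154), with D taken from (153)."""
    v = v_bar(alpha, beta, a, b)
    d = 2.0 * a * beta / v**3 - 1.0 / (v - b)**2
    k = a / v**2
    return [[-1.0 / d, -k / d], [-k / d, 1.5 / beta**2 - k**2 / d]]

def metric_numeric(alpha, beta, a, b, rel_step=1e-3):
    """Central finite-difference Hessian of psi with respect to (alpha, beta)."""
    f = lambda x, y: psi(x, y, a, b)
    hx, hy = rel_step * alpha, rel_step * beta
    fxx = (f(alpha + hx, beta) - 2.0 * f(alpha, beta) + f(alpha - hx, beta)) / hx**2
    fyy = (f(alpha, beta + hy) - 2.0 * f(alpha, beta) + f(alpha, beta - hy)) / hy**2
    fxy = (f(alpha + hx, beta + hy) - f(alpha + hx, beta - hy)
           - f(alpha - hx, beta + hy) + f(alpha - hx, beta - hy)) / (4.0 * hx * hy)
    return [[fxx, fxy], [fxy, fyy]]
```

For example, with `a = b = 1.0` the critical inverse temperature is $27/8$, so `alpha = 0.1`, `beta = 2.0` is supercritical and the two matrices should agree closely. Setting `a = b = 0.0` in `metric_formula` should reproduce the ideal gas metric (99).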
To describe the geometry of the van der Waals gas it will be convenient to introduce
the concept of a spinodal curve. In general a spinodal curve consists of the points in
the thermodynamic phase space at which the second derivative of the free energy with
respect to an order parameter vanishes. For a gas of interacting molecules, the mean
volume $\bar v$ constitutes the order parameter of the system, and the vanishing of the second
derivative of $\ln z(\beta, v)$ with respect to $v$ thus determines the spinodal curve in the phase
diagram. For the van der Waals gas, by (145), we have the relation

$$ \frac{\partial^2 \ln z}{\partial \bar v^2} = \frac{\partial \alpha}{\partial \bar v} = 0 \tag{157} $$

that determines the spinodal curve. In other words, the locus of points at which the
derivative of pressure with respect to volume vanishes for some temperature determines
the spinodal curve. This is schematically illustrated for the van der Waals equation in
Figure 6.
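In the reduced variables of (112) the spinodal condition, namely the vanishing of the volume derivative of the pressure at fixed temperature, can be solved explicitly. Differentiating the reduced pressure gives $\tilde T = (3\tilde V - 1)^2/(4\tilde V^3)$ on the spinodal, and substituting back gives $\tilde P = (3\tilde V - 2)/\tilde V^3$. The sketch below encodes these two relations and checks them against a finite-difference slope of the isotherm; the function names and step size are illustrative assumptions.

```python
def reduced_pressure(v, t):
    """Reduced van der Waals pressure obtained from (112)."""
    return 8.0 * t / (3.0 * v - 1.0) - 3.0 / v**2

def spinodal_temperature(v):
    """Reduced temperature at which dP/dV = 0 for reduced volume v."""
    return (3.0 * v - 1.0)**2 / (4.0 * v**3)

def spinodal_pressure(v):
    """Reduced pressure on the spinodal curve at reduced volume v."""
    return (3.0 * v - 2.0) / v**3

def isotherm_slope(v, t, h=1e-6):
    """Central finite-difference estimate of dP/dV along an isotherm."""
    return (reduced_pressure(v + h, t) - reduced_pressure(v - h, t)) / (2.0 * h)
```

At $\tilde V = 1$ both functions return 1, so the critical point lies on the spinodal curve. At $\tilde V = 2/3$ the spinodal pressure vanishes and the spinodal temperature equals $27/32$, the maximum superheating temperature of (115).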

It is evident from Figure 6 that the region in the phase diagram enveloped by
the spinodal curve is unstable because $(\partial P/\partial V)_T > 0$ in this region, i.e. the pressure
increases with increasing volume. The spinodal curve thus forms the boundary of a
semi-stable region in the phase diagram. In view of the first relation in (152), the
spinodal curve is determined by the condition $D = 0$. On the other hand, from the
expression (154) for the Fisher-Rao metric on the van der Waals manifold we see that
each component of the metric $G_{ij}$ as well as its determinant $-3/(2\beta^2 D)$ is singular
along the spinodal curve. Is this singular behaviour merely due to the specific choice of
coordinates or is it an intrinsic feature of the van der Waals manifold? We can answer
this question by calculating the scalar curvature of the manifold. The exact expression
for the curvature is as follows.

Proposition 8 The scalar curvature of the van der Waals manifold is given by

(158)

In particular, $R$ diverges along the entire spinodal curve specified by $D = 0$, which
includes the critical point $(P_c, V_c, T_c)$. The scalar curvature vanishes in the ideal gas
limit for which $a \to 0$ and $b \to 0$.

Proof. Since we have chosen the canonical parametrisation $(\alpha,\beta)$, we can use
the determinant formula for the scalar curvature given earlier. To compute the entries in the
determinant we differentiate (156) and use the chain rule, together with (152), thereby
obtaining

(159)

Substituting these results into the determinant formula we obtain, after some algebra, the
desired expression in (158). The fact that the curvature diverges along the spinodal curve
$D = 0$ is evident from the expression (158). Also, from the definition (157) of the spinodal
curve and the condition (110) for the critical point, it is clear that the critical point lies on the
spinodal curve. The vanishing of the curvature in the ideal gas limit $a, b \to 0$ is also
evident from the expression in (158). □

The van der Waals manifold possesses the structure of a Riemann surface over a
planar base space with coordinates $(\alpha,\beta)$, branched around the singularities specified by
the spinodal curve. Now, suppose we slowly change the variables along a closed
contour $C$ in the planar base space. Then, the lifted curve in the van der Waals manifold
does not, in general, return to the same sheet (i.e. to the same thermodynamic state) if $C$
encloses the critical point or crosses the spinodal curve, while Maxwell's relation ensures
that an infinitesimal closed contour enclosing no singularities is thermodynamically
trivial. Thus, the presence of singularities may give rise to changes in the thermodynamic
state $\bar v$ of the system upon following a closed contour in the parameter base space which
encloses a point of divergency. Conversely, if a closed contour in the parameter space
does not enclose the critical point or cross the spinodal curve, then the corresponding
curve in $\mathfrak{M}$ is closed and thus gives rise to a well-defined holonomy. This leads naturally
to the following open problem: What is the physical interpretation or relevance of the
holonomy (analogue of the geometric phase in quantum mechanics) in classical statistical
mechanics?

Figure 7. Geometric phase diagram for the van der Waals gas. The gas is divided
into positive and negative curvature phases by the vanishing curvature $R = 0$ curve.
The change of phase along $R = 0$ is analytic, while the curvature exhibits singular
behaviour along the spinodal curve $D = 0$.

Finally we note that the scalar curvature $R$ on the van der Waals manifold vanishes
along the curve specified by

and its sign changes smoothly from positive to negative as one decreases the temperature
in the $(v,\beta)$ plane. The sign of the scalar curvature in the $(v,\beta)$ plane and its relation
to the spinodal and Maxwell boundaries are schematically illustrated in Figure 7. One
might refer to the smooth change in the sign of the scalar curvature in the phase diagram 
as a geometric phase transition. Unlike the conventional phase transitions associated 
with singular behaviour, however, such geometric phase transitions are not associated 
with any divergence. There are indications that the scalar curvature can be viewed 
as a measure of stability of the system. However, the precise correspondence between 
physical characteristics of the system and properties of the curvature, apart from the 
singular behaviour along the spinodal boundary, remains an open research problem. 

4. Bibliographical notes 

When the first author was invited to write a survey article on a topic of interest, it 
seemed appropriate to utilise this opportunity by briefly reviewing some applications 
of information geometry in physics. This reflects the fact that interest in this area 
has continued to grow. Two international conferences on applications of information 
geometry have been held in recent years, while diverse new applications continue to 
emerge — for example, in the area of shape recognition in computer science (Maybank 
2005, Peter & Rangarajan 2006), in connection with out of equilibrium measures 
(Crooks 2007), in the characterisation of quantum phase transitions (Zanardi et al. 
2007), in various mathematical extensions (e.g., Cena & Pistone 2007), or in black 
hole thermodynamics (Ruppeiner 2008). The application of information geometry to 
statistical physics, however, has generated a vast amount of literature, and it seemed 
neither feasible nor helpful to cover all the results which have emerged in this area. What 
seemed more appropriate was to focus attention upon one specific topic that nevertheless 
incorporates all the essential ingredients. It also appeared desirable to explain the 
basic ideas of information geometry and its emergence from probabilistic and statistical 
considerations in a language accessible to a graduate student in theoretical physics. For 
these reasons we begin the paper with some rather elementary background material,
and then consider its application to the theory of vapour-liquid equilibrium. This leads 
to the analysis of the van der Waals gas model, which, in our opinion, not only embodies 
all the essential features of a phase transition in statistical mechanics but also admits 
an elegant geometric characterisation. To keep the exposition at a fairly elementary 
level, we have excluded ad hoc citations from the main text so as to avoid impeding 
the continuity of the exposition. Instead, references to the literature are presented more 
informally in these bibliographical notes. 

To the authors' best knowledge, the use of geometric methods in statistical analysis 
was first introduced by P. C. Mahalanobis, the founder of the Indian Statistical Institute 
and also the founding editor of Sankhya (The Indian Journal of Statistics), in the early 
1930's (Mahalanobis 1930, 1936). Mahalanobis was a physicist and statistician who 
taught relativity and wrote an introduction to the translation by M. Saha of Minkowski's
work on relativity. His articles on relativity, written jointly with S. N. Bose were
published by the Calcutta University. He introduced a measure of mutual separation in 
the study of statistical data arising from anthropometric measurements. An alternative 
measure of separation was subsequently introduced by Bhattacharyya (1943, 1946), and 
was defined in Section 1 as the Bhattacharyya spherical distance between two probability
densities. 

The geometric description of the parameter-space manifold was initiated at around 
the same time by Rao (1945, 1947, 1954). The seminal paper by Rao (1945) is 
significant in two respects: on the one hand, it introduced the so-called Cramér-Rao
inequality as a lower bound for the variance, while on the other hand it pointed out that
the information measure introduced previously by Fisher (1925) defines a Riemannian
metric on the parameter-space manifold of a statistical model. Rao then proposed
the associated geodesic distance as a measure of dissimilarity between probability
distributions. The formulation considered by Rao was based on the Hilbert space
embedding $p_\theta(x) \to \sqrt{p_\theta(x)}$. As a consequence, many of the constructions in Rao
(1945) bear a formal resemblance to the geometric formulation of quantum mechanics, 
developed by physicists some time later in the 1980's and 1990's. Concurrently with 
Rao's work on the application of geometry in statistics, Jeffreys (1946) also introduced 
the concept of uninformative priors based on the use of a Riemannian metric. 

It is worth noting incidentally that the inner product of probability measures via 
the square-root embedding was introduced earlier by Hellinger (1909) in connection 
with unitary invariants of self-adjoint operators in Hilbert space. The distance measure 
of Hellinger was subsequently extended by von Neumann and Kakutani (see Kakutani 
1946), who introduced an inner product of probability measures in an abstract measure- 
theoretic context and applied this to investigate equivalence and orthogonality relations 
between product measures. The Kakutani inner product was used by Brody (1971) to 
provide a simple proof of the Gaussian dichotomy theorem, and has also been extended 
by Bures (1969) in the context of operator algebras. 

Interest in the application of geometrical techniques to statistical inference appears
to have somewhat diminished subsequently, but reemerged with the appearance of
Efron's seminal paper (1975) on information loss and statistical curvature. Efron
considered the logarithmic embedding $p_\theta(x) \to l_\theta(x) = \ln p_\theta(x)$ and demonstrated that
in the space of log-likelihood density functions the curvature of the curve $l_\theta$ measures
the deviation of the density function from the exponential family. Furthermore, the
squared statistical curvature was shown to determine the loss of information resulting
from the use of the maximum likelihood estimator for the unknown parameter $\theta$.

The work of Efron (1975, 1978) — and to some extent that of Cencov (1982) who 
showed that the Fisher information metric is unique under certain assumptions including 
invariance — evoked considerable interest in the geometrical approach to asymptotic 
inference and related topics in statistics. Numerous research papers (for example, 
Atkinson & Mitchell 1981), as well as review papers (for example, Kass 1989) and 
monographs (for example, Amari 1985, Amari et al. 1987, Murray & Rice 1993, and
Amari & Nagaoka 2000), were subsequently published.

In parallel with these developments in statistics, the application of information 
geometry to the description of the equilibrium properties of thermodynamic systems 
was considered by a number of authors. One of the initiators was Ingarden (1981), 
who considered various Banach space embeddings of probability density functions and 
their relation to the concepts of entropy and Fisher information, and suggested their 
application to statistical physics. The works of Ingarden and his collaborators led to 
the establishment of a Polish School of researchers systematically investigating various 
geometric aspects of physical systems described by equilibrium distributions. Janyszek 
& Mrugala (1989a), for instance, clarified the relation of the Fisher-Rao geometry to
contact geometry (the latter characterises the geometry of Legendre transformations), 
and investigated the physical interpretation of the metric tensor for a system of 
gas molecules characterised by the pressure-temperature distribution. Janyszek & 
Mrugala (1989b) calculated the scalar curvatures of the parameter space manifolds of 
the one-dimensional Ising model in the thermodynamic limit and of the mean-field 
model. Janyszek & Mrugala (1990) also extended information geometric analysis to the 
investigation of the stability of ideal quantum gases. Some of these ideas were further 
extended by others in the context of various spin models in statistical mechanics (see, 
for example, Janke et al. 2002, Brody & Ritz 2003, Johnston et al. 2003). 

An independent line of investigation on the various geometric properties of 
thermodynamic state spaces was initiated by Weinhold (1975) and also by Ruppeiner 
(1979). Weinhold proposed the existence of a metric structure for the thermodynamic 
state space arising from empirical laws of thermodynamics. This line of thinking, which 
led to the notion of the so-called thermodynamic length, was extended in a variety of 
ways by various authors (see, for example, Salamon & Berry 1983, Schlogl 1985, Mrugala 
et al. 1990). 

Ruppeiner went a step further and considered the second derivative of the entropy, 
discussed briefly here in Section 1.5, as a Riemannian metric on the thermodynamic
state space. The metric considered by Ruppeiner, based upon the Shannon-Wiener
entropy, agrees with Rao's entropy derivative metric (cf. Rao 1984), and is also related 
to the Fisher-Rao metric through a Legendre transformation. The key idea is that the 
convexity of entropy implies the positive-definiteness of the associated Hessian matrix, 
which can therefore be used to define a Riemannian metric on the thermodynamic 
state space. The simplest system to consider in this respect is naturally that of a 
noninteracting gas of classical particles. This was investigated by Ruppeiner (1979), and 
also by Mijatovic et al. (1987). Geometric aspects of various other physical systems have 
also been investigated along these lines (Ruppeiner 1990, 1991). For a comprehensive 
bibliography on this topic, see the reference list in the review article by Ruppeiner 
(1995). 

Ingarden & Tamassy (1993) also applied an entropic measure of divergence to 
explain the thermodynamic arrow of time. While the infinitesimal form of the 
entropic measure of divergence (relative entropy) gives rise to a Riemannian structure 
characterised in general by the Burbea-Rao metric (85), the metrics arising from
entropies generally possess Finslerian structure. In particular, the Finsler metric arising 
from relative entropy is not symmetric. The idea of Ingarden & Tamassy (1993) consists 
in exploiting the lack of symmetry in such metrics to explain the thermodynamic arrow 
of time, without the introduction of dissipation. 

An important application of information geometry to the properties of 
renormalisation group flow in statistical physics — motivated in part by the observation 
of Dawid (1975) that Efron's results might be represented more concisely in terms of 
Hilbert space geometry — was proposed by Brody (1987). Closely related ideas were 
developed further by O'Connor & Stephens (1993), Dolan (1998), Brody & Ritz (1998), 
and Brody (2000). (See also Diosi et al. (1984) for an alternative approach to analysis 
of renormalisation group flow using Rao's entropy derivative metric.) 

The ideas of information geometry have also been extended to the quantum domain 
by replacing the density functions of classical probability theory by density matrices. 
Substantial work has been done on quantum information geometry (for example, Petz 
& Sudar 1996, Uhlmann 1996, Grasselli & Streater 2001, Petz 2002, Streater 2004,
Jencova & Petz 2006, Gibilisco & Isola 2007, and Gibilisco et al. 2007), as well as its
application to quantum statistical inference (Brody & Hughston 1998, Barndorff-Nielsen
& Gill 2000). 

Turning more specifically to the subject matter of the present review, as indicated 
above, the spherical distance (7) representing the dissimilarity of probability densities
was introduced by Bhattacharyya (1943), and the concept of a statistical manifold 
represented by the metric (23) was introduced by Rao (1945). The uniqueness of the
Einstein metric on a complex projective space was conjectured by Calabi, and later 
proven by Yau (1977). The implication of this result in quantum mechanics — that the 
solution to the vacuum Einstein equation in the space of pure quantum states determines 
transition probabilities — was pointed out to the first author by G. W. Gibbons in the 
late 1990s. The relevance of the projective space and the associated metric (17) to
statistical mechanics was demonstrated by Brody & Hughston (1999). 

The expression in Proposition 2 for the scalar curvature in terms of the determinant
of a 3 × 3 matrix, valid for the two-dimensional statistical manifold associated with a
canonical density function, is given in Janyszek & Mrugala (1989b). The fact that 
the statistical manifold associated with the normal density function possesses constant 
negative curvature, shown in equation (57), was observed by Amari (1982). However,
the significance of the scalar curvature (or in fact the Riemann tensor itself) in statistical 
analysis remains somewhat obscure. 

A survey article by Burbea (1986) deals systematically with expressions for geodesic 
curves associated with a number of standard density functions used in statistics (see 
also the article by Rao in Amari et al. 1987). This includes, in particular, the distance 
between Gaussian density functions as given in (69) and (77). Analogous results for
gamma-distributed densities have been calculated in some detail by Burbea et al. (2002). 

The fact that the leading order term in the Taylor expansion of the relative entropy 
of a neighbouring pair of parametric density functions gives rise to the Fisher-Rao metric
was observed by Ingarden (1981). A more detailed and thorough analysis was given by 
Burbea & Rao (1984), and constitutes the basis for the discussion in Section 1.5. The
specific form of entropy defined in (80), sometimes referred to as the α-order entropy,
was introduced by Havrda & Charvat (1967) in the context of quantifying classification
schemes. See also Burbea & Rao (1982a, 1982b) for details concerning various properties 
of this entropy. The use of the α-order entropy in statistical mechanics has been proposed
by Tsallis (1988). 
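
The observation above, that the leading-order term of the relative entropy between neighbouring densities reproduces the Fisher-Rao metric, can be checked numerically for the normal family. The following is a minimal sketch rather than a computation taken from the text: it assumes the standard closed-form relative entropy between two univariate Gaussians and the standard Fisher-Rao line element (dμ² + 2dσ²)/σ² in the (μ, σ) coordinates, and the function names are illustrative only.

```python
import math


def kl_gaussian(mu1, s1, mu2, s2):
    """Relative entropy D(p1 || p2) between the normal densities
    N(mu1, s1^2) and N(mu2, s2^2), using the standard closed form."""
    return math.log(s2 / s1) + (s1**2 + (mu1 - mu2)**2) / (2.0 * s2**2) - 0.5


def fisher_quadratic_form(s, dmu, ds):
    """Fisher-Rao line element (dmu^2 + 2 ds^2) / s^2 in (mu, sigma)
    coordinates (assumed normalisation: the usual Fisher information)."""
    return (dmu**2 + 2.0 * ds**2) / s**2


def compare(mu, s, dmu, ds):
    """Return the exact relative entropy for a small displacement and its
    leading-order approximation, one half of the Fisher quadratic form."""
    exact = kl_gaussian(mu, s, mu + dmu, s + ds)
    approx = 0.5 * fisher_quadratic_form(s, dmu, ds)
    return exact, approx


if __name__ == "__main__":
    # The ratio exact/approx should tend to 1 as the displacement shrinks.
    for eps in (1e-1, 1e-2, 1e-3):
        exact, approx = compare(0.0, 1.0, eps, eps)
        print(eps, exact, approx, exact / approx)
```

The ratio printed in the last column approaches unity as the displacement is reduced, with the remaining discrepancy coming from the cubic and higher terms in the Taylor expansion.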

In Section 2 we considered the information geometry of the pressure-temperature
distribution representing the equilibrium state of a gas of noninteracting particles 
(classical ideal gas). The flatness of the information manifold of a classical ideal gas 
was pointed out by Ruppeiner (1979) using the entropy derivative metric. The solution 
to the associated geodesic equations, in the form expressed in Proposition 5, does not
seem to appear elsewhere. (An alternative representation of the geodesics appears in
Mijatovic et al. 1987.) 

In Sections 3.1-3.4 we have provided a brief account of the classical theory of the
van der Waals gas as background for the subsequent geometric description. Our
exposition follows closely the classic treatise of Mayer & Mayer (1940). There is a series
of inspiring papers by Kac et al. (1963), Uhlenbeck et al. (1963), Hemmer et al. (1964), 
and also Hemmer (1964), analysing the vapour-liquid equilibrium of the van der Waals 
gas in great detail. These papers extend the earlier work of Kac (1959), which provides 
a method for determining the partition function of interacting gas molecules. 

Other related work on systems of interacting gas molecules includes the following: 
Tonks (1936) determined the equations of state for gases composed of hard elastic 
spheres with finite radius. Van Hove (1950) calculated the free energy of a system 
of molecules with nonvanishing incompressible radii, interacting according to a finite 
range force. He showed that in one dimension the system exhibits no phase transition. 
Lebowitz & Percus (1963) studied the properties of the correlation functions. Van 
Kampen (1964) showed that a gas of molecules with hard sphere repulsive forces and 
long-range attractive interactions exhibits condensation, and calculated the density 
fluctuations. Rigorous bounds for the free energy of the van der Waals gas were
derived by Lebowitz & Penrose (1966). 

Detailed analyses of the curvature of some of these classical systems of interacting 
molecules were presented by Ruppeiner & Chance (1990). The geometry of the van der 
Waals gas associated with the entropy derivative metric is considered in papers by Diosi 
& Lukacs (1986) and in Diosi et al. (1989), wherein the authors determine the scalar 
curvature using the density and temperature as coordinates. Using these coordinates, 
they have also shown that on this statistical manifold there exists no solution to the 
Killing equations (i.e. no vector field such that the associated flow preserves geodesic 
distances).

The description of the geometry of the van der Waals manifold presented in 
Section 3.5 follows closely the analysis outlined by Janyszek (1990) and also by Brody
& Rivier (1995). In particular, the expressions for the metric tensor in Proposition 7
and for the scalar curvature in Proposition 8 were derived by Brody & Rivier (1995),
who also suggested that the curvature of the statistical manifold might play a role in 
statistical mechanics analogous to that of geometric phases in quantum mechanics. This 
remains an open issue, although recent work on quantum phase transitions indicates that
there is indeed a close analogy between these two concepts. 

Finally, the present authors regret that owing to the huge volume of literature on 
this subject there are many other valuable contributions which have not been mentioned 
in these brief bibliographical notes.